Jun 13, 2019

HTML page parsing with xmllint xpath in BASH


Per HTML_parsers, there is no better option for parsing HTML pages in BASH. Inspired by Retrieve web using xpath, here is a summary of using xmllint with xpath:


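# HTML_PAGE is expected to hold the page source
xpath='' # sample: '//div[@class = "tides"]'

# Print the element(s) matched by $xpath
get_element_by_xpath() {
    printf '%s\n' "$HTML_PAGE" | xmllint --html --xpath "$xpath" - 2>/dev/null
}

# Print only the text nodes of the matched element(s); $xpath itself is left unchanged
get_element_text_by_xpath() {
    printf '%s\n' "$HTML_PAGE" | xmllint --html --xpath "${xpath}/text()" - 2>/dev/null
}

# Print the number of elements matched by $xpath
get_elements_count_by_xpath() {
    printf '%s\n' "$HTML_PAGE" | xmllint --html --xpath "count(${xpath})" - 2>/dev/null
}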
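
A minimal usage sketch of the same three queries, assuming xmllint (from libxml2) is installed. The sample page and the helper name xpath_query are made up for illustration and are not part of the functions above. The helper takes the expression as an argument instead of reading the global xpath.

```shell
#!/usr/bin/env bash

# Hypothetical sample page; a real page could be loaded with
# something like: HTML_PAGE=$(curl -s "$url")
HTML_PAGE='<html><body>
<div class="tides">High 3.2m</div>
<div class="note">first</div>
<div class="note">second</div>
</body></html>'

# Run one XPath expression against $HTML_PAGE.
# The helper name is an assumption made for this sketch.
xpath_query() {
    printf '%s\n' "$HTML_PAGE" | xmllint --html --xpath "$1" - 2>/dev/null
}

xpath='//div[@class = "tides"]'

element=$(xpath_query "$xpath")                       # the whole matched element
text=$(xpath_query "${xpath}/text()")                 # only its text node
count=$(xpath_query 'count(//div[@class = "note"])')  # number of matching nodes

echo "element: $element"
echo "text:    $text"
echo "count:   $count"
```

Quoting "$xpath" matters here: the sample expression contains spaces, so an unquoted expansion would be split into several arguments before xmllint sees it. Because stderr is discarded, a query that matches nothing prints nothing, so checking the exit status of xmllint can help tell "no match" apart from an empty element.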
