Sample timing comparison (output of the shell time command):
With jq:
real 0m1.666s
user 0m0.262s
sys 0m0.290s
With Python json.tool:
real 0m2.217s
user 0m0.492s
sys 0m0.539s
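A comparison like the one above could be reproduced roughly as follows. This is only a minimal sketch: the original input and the exact command lines are not shown, so the sample JSON, the `time_command` helper, and the `jq .` invocation are assumptions made for illustration. It measures wall-clock time only (the "real" figure), not user or sys time, and it skips jq if it is not installed.

```python
import shutil
import subprocess
import sys
import time


def time_command(cmd, input_bytes):
    """Return wall-clock seconds taken to run cmd with input_bytes on stdin."""
    start = time.perf_counter()
    subprocess.run(cmd, input=input_bytes, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


# Hypothetical sample input; the original test data is not known
sample = b'{"name": "example", "values": [1, 2, 3]}'

# python -m json.tool pretty-prints JSON read from stdin
commands = {"python json.tool": [sys.executable, "-m", "json.tool"]}

# Only time jq when it is available on this machine
if shutil.which("jq"):
    commands["jq"] = ["jq", "."]

timings = {label: time_command(cmd, sample) for label, cmd in commands.items()}
for label, seconds in timings.items():
    print("%s: %.3fs" % (label, seconds))
```

With an input this small the numbers mostly reflect process start-up cost, so a larger JSON file would be needed for a meaningful comparison.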
Useful links:
A summary of my development and QA automation work experience, plus a list of useful links and articles on development, automated regression testing, load testing, etc.
import requests
from lxml import html

# Fetch the Wikipedia page and parse it into an element tree
pageContent = requests.get('https://en.wikipedia.org/wiki/List_of_Olympic_medalists_in_judo')
tree = html.fromstring(pageContent.content)

# Gold and silver winners are the first link in the 2nd and 3rd table cells
goldWinners = tree.xpath('//*[@id="mw-content-text"]/table/tr/td[2]/a[1]/text()')
silverWinners = tree.xpath('//*[@id="mw-content-text"]/table/tr/td[3]/a[1]/text()')
# Bronze winners: we need rows where there's no rowspan (note the XPath predicate)
bronzeWinners = tree.xpath('//*[@id="mw-content-text"]/table/tr/td[not(@rowspan=2)]/a[1]/text()')
medalWinners = goldWinners + silverWinners + bronzeWinners

# Count how many medals each athlete has won
medalTotals = {}
for name in medalWinners:
    if name in medalTotals:  # dict.has_key() was removed in Python 3
        medalTotals[name] = medalTotals[name] + 1
    else:
        medalTotals[name] = 1

# Print athletes sorted by medal count, highest first
for result in sorted(medalTotals.items(), key=lambda x: x[1], reverse=True):
    print('%s:%s' % result)
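The tallying step in the scraper above can also be written more compactly with collections.Counter from the standard library. The sketch below is an alternative, not the original approach, and the athlete names are hypothetical placeholders standing in for the lists the XPath queries would return.

```python
from collections import Counter

# Hypothetical sample data standing in for the scraped name lists
goldWinners = ["Athlete A", "Athlete B"]
silverWinners = ["Athlete B", "Athlete C"]
bronzeWinners = ["Athlete A", "Athlete B", "Athlete D"]

# Counter tallies how often each name appears across all three lists
medalTotals = Counter(goldWinners + silverWinners + bronzeWinners)

# most_common() returns (name, count) pairs sorted by count, highest first
for name, count in medalTotals.most_common():
    print('%s:%s' % (name, count))
```

Counter removes the need for the explicit membership check, and most_common() replaces the manual sort by count.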