v***@sbt.net.au
2014-01-08 23:52:57 UTC
I have a script like:
wget -O page.html url
lynx -dump page.html > page.txt
That worked until the web server was redeveloped.
Now they use HTML5, and page.html has the data I want, but page.txt
only has the labels, not the data contents. Any thoughts on how I can
do that?
When displayed on screen, the data shows; in the text file, it does not.
Looking at page.html, it has something like:
/snip/
<label class="pfbc-label">Suburb</label><input type="text"
name="SYS_Addresses_e_address_i_0_e_district_tx" value="SYDNEY"
readonly="readonly" class="ro pfbc-textbox"/>
<label class="pfbc-label">State</label><input type="hidden" value="NSW"
name="SYS_Addresses_e_address_i_0_e_state_cd"><input type="text"
name="SYS_Addresses_e_address_i_0_e_state_cd_d" value="NSW"
readonly="readonly" class="ro pfbc-textbox"/>
<label class="pfbc-label">Postcode</label><input type="text"
name="SYS_Addresses_e_address_i_0_e_postcode_tx" value="2000"
readonly="readonly" class="ro pfbc-textbox"/>
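
One possible direction, as a rough sketch only: in the snippet above the data sits in the value attributes of the <input> tags rather than in the page text, so it could be pulled straight out of page.html instead of going through a text dump. The sketch below uses Python's standard html.parser module. The names LabelValueParser and extract_fields are made up for illustration, and it assumes every visible input follows its <label> as in the snippet; hidden inputs are skipped so that State is not reported twice.

```python
import sys
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Pair the text of each <label> with the value of the next visible <input>."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # collected (label, value) tuples
        self._in_label = False
        self._label = ""       # text of the most recent <label>

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            self._in_label = True
            self._label = ""
        elif tag == "input":
            a = dict(attrs)
            # Skip hidden inputs; they duplicate the visible field in the snippet.
            if a.get("type") != "hidden" and a.get("value") is not None:
                # Fall back to the input's name if no label text was seen.
                key = self._label.strip() or a.get("name", "")
                self.pairs.append((key, a["value"]))

    def handle_endtag(self, tag):
        if tag == "label":
            self._in_label = False

    def handle_data(self, data):
        if self._in_label:
            self._label += data


def extract_fields(html):
    """Return a list of (label, value) tuples found in the given HTML string."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python3 fields.py page.html > page.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for label, value in extract_fields(f.read()):
            print(f"{label}: {value}")
```

Saved under some name such as fields.py (hypothetical), this would take the place of the lynx step: run wget as before, then run the script on page.html and redirect its output to page.txt.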