Extracting data from web pages

Post by PhotonDemon » Tue Jun 17, 2008 7:32 am

Hi All,

I'm interested in extracting data from web pages using VA Smalltalk. Has anyone tried this? Any ideas or hints on where to start will be greatly appreciated. Thanks.

Lou
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com

Re: Extracting data from web pages

Post by tc » Tue Jun 17, 2008 2:43 pm

Hello,

Your message is brief, so at the most basic level, one could do something like:

Code:
'http://www.yahoo.com/' sstAsUrl fetch


. . . and parse through the HTML that is returned.

--tc

Re: Extracting data from web pages

Post by PhotonDemon » Wed Jun 18, 2008 5:10 am

Thanks for the reply. Sorry for the brief request. Your suggestion is a good start. Is there an existing parser that will parse the returned HTML?
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com

Re: Extracting data from web pages

Post by tc » Fri Jun 20, 2008 12:52 am

Hello,

I am not aware of an HTML parser, but if you are trying to extract certain information, then given all of ST's string functions, it should be fairly easy to drill down into the HTML and pull out what you need.
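
As a rough illustration of that kind of drilling down, here is a minimal sketch that pulls the text between an opening and a closing tag out of a string. It assumes the ANSI-style messages indexOfSubCollection:startingAt: and copyFrom:to: are available in your image. The sample HTML and the variable names are made up for the example, and turning the result of the fetch into a String is not shown.

```smalltalk
| html openTag closeTag start stop title |
"Made-up sample HTML standing in for the content of a fetched page."
html := '<html><head><title>Example Page</title></head><body></body></html>'.
openTag := '<title>'.
closeTag := '</title>'.
title := nil.
"indexOfSubCollection:startingAt: answers 0 when the subcollection is not found."
start := html indexOfSubCollection: openTag startingAt: 1.
start = 0
    ifFalse: [
        "Skip past the opening tag, then look for the closing tag from there."
        start := start + openTag size.
        stop := html indexOfSubCollection: closeTag startingAt: start.
        stop = 0
            ifFalse: [title := html copyFrom: start to: stop - 1]].
title
```

Evaluating this in a workspace should leave title holding the text between the two tags, or nil if either tag is missing. The search is case-sensitive, so a page that writes its tags in upper case or puts attributes inside the opening tag would need a looser match.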

--tc