Hello!
I would really like to use the GET object to download the source HTML from a website and then parse it, splitting everything into easily usable values.
For instance, if I had this HTML:
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<a href=http://www.w3schools.com>This is a link</a>
<a href=http://www.w3schools.com>This is also a link</a>
<p>My first paragraph.</p>
</body>
</html>
I would like to be able to loop through it all and convert it into values like these:
H10text="My First Heading"
Link0URL="http://www.w3schools.com"
Link0Description="This is a link"
Link1URL="http://www.w3schools.com"
Link1Description="This is also a link"
P0text="My first paragraph."
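To make the goal concrete, here is a rough sketch of the same extraction in Python (not Fusion), using the standard library's html.parser module. The class name ValueCollector and the function extract_values are made up for illustration, and the key names simply follow the list above. It only tracks h1, a and p tags and assumes they are not nested inside each other.

```python
from html.parser import HTMLParser


class ValueCollector(HTMLParser):
    """Collects heading, link and paragraph values (hypothetical helper)."""

    # Maps a tag name to the key pattern used in the wish list above.
    KEYS = {"h1": "H1{}text", "a": "Link{}Description", "p": "P{}text"}

    def __init__(self):
        super().__init__()
        self.values = {}
        self.counts = {tag: 0 for tag in self.KEYS}
        self.open_tags = []  # entries are (tag, index, collected text parts)

    def handle_starttag(self, tag, attrs):
        if tag not in self.KEYS:
            return  # untracked tags such as <font> are ignored
        index = self.counts[tag]
        self.counts[tag] += 1
        if tag == "a":
            self.values[f"Link{index}URL"] = dict(attrs).get("href", "")
        self.open_tags.append((tag, index, []))

    def handle_data(self, data):
        # Text belongs to the innermost tracked tag that is still open.
        if self.open_tags:
            self.open_tags[-1][2].append(data)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1][0] == tag:
            _, index, parts = self.open_tags.pop()
            self.values[self.KEYS[tag].format(index)] = "".join(parts).strip()


def extract_values(html):
    collector = ValueCollector()
    collector.feed(html)
    collector.close()
    return collector.values
```

Calling extract_values on the HTML above should give a dictionary with H10text, Link0URL, Link0Description and so on. Because untracked tags are skipped, text wrapped in something like a FONT tag inside a paragraph still ends up in that paragraph's value.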
I've tried using the string parser with '<' and '>' as delimiters, but then I end up with something like this (the first few lines as an example):
!DOCTYPE html
html
body
h1
My First Heading
/h1
a href=http://www.w3schools.com
This is a link
/a
This is okay, because I can use the string parser to get specific elements by looping through them, something like:
On loop "elements" +
If element = "h1" -> set H10text to getElementById(LoopIndex("elements") + 2)
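For anyone who finds events hard to read, the same idea can be sketched in Python (again, not Fusion). The names split_tokens and first_heading are made up for this example. It splits on '<' and '>', drops empty pieces, and takes the token that follows "h1", which is index i + 1 in Python's 0-based list.

```python
import re


def split_tokens(html):
    # Split on '<' and '>' and drop empty or whitespace-only pieces.
    return [piece.strip() for piece in re.split(r"[<>]", html) if piece.strip()]


def first_heading(tokens):
    # Look for the "h1" token and return the token right after it.
    for i, token in enumerate(tokens):
        if token == "h1" and i + 1 < len(tokens):
            return tokens[i + 1]
    return None
```

This only matches a token that is exactly "h1", which is the weakness of the approach: any attribute or nested tag changes the token list.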
But that only works when there are no extra attributes or nested tags, as in this line:
<P><FONT SIZE="4" FACE="Arial, Helvetica, sans-serif">Some example text</FONT></P>
Which has me a bit stumped right now.
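One possible way around this, sketched in Python rather than Fusion events: if the empty pieces are kept after splitting on '<' and '>', the pieces alternate between text (even positions) and tag contents (odd positions). The tag name is then just the first word of a tag piece, so attributes no longer matter, and text can be collected until the matching closing tag. The function name text_inside is made up, and the sketch assumes well-formed HTML with no '<' or '>' inside attribute values and a closing tag for every tag of interest.

```python
import re


def text_inside(html, wanted):
    """Return the text found inside each <wanted>...</wanted> pair."""
    pieces = re.split(r"[<>]", html)  # even index: text, odd index: tag
    results, depth, current = [], 0, []
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            # Tag piece: the first word is the tag name, the rest are attributes.
            words = piece.split()
            name = words[0].lower() if words else ""
            if name == wanted:
                if depth == 0:
                    current = []
                depth += 1
            elif name == "/" + wanted and depth > 0:
                depth -= 1
                if depth == 0:
                    results.append("".join(current).strip())
        elif depth > 0:
            # Text piece inside the wanted tag, possibly within nested tags.
            current.append(piece)
    return results
```

With the FONT example above, text_inside(line, "p") would skip over the FONT tag and its attributes and return the paragraph text.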
Thanks for taking the time to read this. Any advice?