Hello all,
I was wondering whether there is a library that can parse HTML into structs/classes holding the parent, children, attributes and content of each HTML tag. I’m having a boring time rewriting my own parser, and I need to extract information from Wikipedia for a new website I’m working on. I know it’s easy to do in JavaScript and jQuery, but I’m more familiar with openFrameworks, and I want a system that will update whenever the page is updated. I also plan to put the extracted data into MySQL, so I’m unsure how to handle that with PHP and JavaScript.
This is an example of the kind of struct I want. I’m not asking anyone to write code for me, though! I’m just wondering if it has already been written.
struct HTML
{
    std::string tag;        // tag name, e.g. "div"
    std::string attributes; // attribute text of the tag
    std::string contents;   // content inside the tag
    HTML * parent;          // enclosing tag
};
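To make the idea a bit more concrete, here is a rough sketch of a fuller version of that struct which also tracks children, assuming C++14 or later. The names HtmlNode, addChild and findAll are made up for illustration and do not come from any existing library. It only shows the tree shape I have in mind and does no actual HTML parsing.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: one element of a parsed HTML tree.
struct HtmlNode {
    std::string tag;                                  // tag name, e.g. "div"
    std::map<std::string, std::string> attributes;    // attribute name -> value
    std::string contents;                             // text directly inside the tag
    HtmlNode* parent = nullptr;                       // non-owning; nullptr for the root
    std::vector<std::unique_ptr<HtmlNode>> children;  // owned child elements
};

// Append a new child element to `parent` and return a pointer to it.
// The child lives on the heap, so the pointer stays valid as more children are added.
HtmlNode* addChild(HtmlNode& parent, const std::string& tag) {
    assert(!tag.empty() && "an element needs a tag name");
    auto child = std::make_unique<HtmlNode>();
    child->tag = tag;
    child->parent = &parent;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Collect every node in the subtree (including `node` itself) whose tag matches,
// in depth-first document order.
void findAll(HtmlNode& node, const std::string& tag, std::vector<HtmlNode*>& out) {
    if (node.tag == tag) out.push_back(&node);
    for (auto& child : node.children) findAll(*child, tag, out);
}
```

With something like this, a parser would call addChild each time it meets an opening tag, and findAll(root, "a", links) would then gather all the link elements so their attributes could be written to MySQL. The root node should stay in one place once children are attached, because each child keeps a raw pointer to its parent.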
All the Best