Ofx and Google Image Search

Is there a good workaround for getting the Google image search API into OFX?

Something that, given a string, returns the best-matched or first-found image?

The closest thing that will help is https://github.com/jefftimesten/ofxJSON. But you have to register with Google to get an API key.

It doesn’t use the API, but the regularExpressionExample does an image search and returns content.

Great resources guys, I will dig into it.

The regularExpressionExample seems to be broken.

Hi!

The example is not working for me either. The only thing the app shows is the content of the rawData variable. I think it has to do with these lines of code:

RegularExpression regEx("<//table class=\"images_table\" style=\"table-layout:fixed;\" width=\"100%\" >(.*?)");
RegularExpression::Match match;
int found = regEx.match(rawData, match);

(I had to include the // before table to let the text be shown properly on the forum)

The variable rawData contains the right data (the HTML text of the Google page), but I get no matches when comparing it to the expression. The variable “found” is always 0. Even if I try other, simpler expressions, I get no result.

Is there anyone who can help with this?

I am working with Code::Blocks on Windows 8.

Hi!
Same problem here, the example is not working!
Anyone?

It is possible that Google has changed the structure of their image results HTML and the regex no longer works. Have you looked at the raw HTML to confirm this?

@TobiasZ That regex won’t work with the current HTML returned, as @bakercp was suggesting. It looks like Google no longer uses table tags to organize the images on the results page. That’s why there are no matches.

Thanks to @bakercp and @mikewesthad for your help! Any idea how or where we could find a way to fix this? Unfortunately, I don’t think I know regular expressions well enough to fix it myself… Should I report the bug to the creator of the example?
Thanks

I’d go ahead and report the bug. If you want to move forward on this, you have a couple of options you could pursue:

  1. You could go with @Ahbee’s suggestion and look for a Google API that gives you an easy way to interface with Google image search. It’s not entirely clear to me how to do this through the Google APIs. The older APIs for Google image search were shut down, and everything seems to point towards using Google’s Custom Search API.

You could also switch over to Bing. They have a more clearly defined API for what you are looking for, the Bing Search API, and they’ve got a getting started guide.

With either the Bing or the Google API route, you’d need to register and get a free API key. You’ll then be limited to a certain number of searches per month (Bing) or per day (Google).

  2. The other route would be to pursue a regular expression that works on the current HTML that is returned when doing a Google image search. I think you’d want a regex to look for things like this:

So, you’d want to find <a> tags that have a class attribute of “rg_l”. Then you’d want to pull out this URL from the hyperlink (it’s the part that follows imgurl=):

http://animalia-life.com/data_images/cat/cat1.jpg

That will give you the image in its original size.

  3. The last route would be to find a C++ HTML parser. In Python, I used something called BeautifulSoup. These parsers read the HTML that you give them and then allow you to dig around in it. So in BeautifulSoup, I could just ask for the <a> tags that have a class of “rg_l”, and I would get a list of all those elements. No need to write a regular expression.
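
To make the regex route a bit more concrete, here is a minimal sketch of pulling the imgurl= value out of every <a> tag with class “rg_l”. It uses std::regex rather than the Poco RegularExpression class from the example, and the function name extractImageUrls is made up for illustration. It assumes the attributes are double-quoted and that the markup looks the way it is described above, which may not match what Google actually returns.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the imgurl= value from each <a> tag whose
// class attribute is exactly "rg_l". Assumes double-quoted attributes.
std::vector<std::string> extractImageUrls(const std::string& html) {
    static const std::regex anchorTag(R"rx(<a\b[^>]*>)rx", std::regex::icase);
    static const std::regex classAttr(R"rx(class="rg_l")rx");
    // The value ends at the next '&' (or "&amp;") or at the closing quote.
    static const std::regex imgUrl(R"rx(imgurl=([^&"]+))rx");

    std::vector<std::string> urls;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(html.begin(), html.end(), anchorTag); it != end; ++it) {
        const std::string tag = it->str();  // the opening tag only, e.g. <a class=... href=...>
        std::smatch urlMatch;
        if (std::regex_search(tag, classAttr) && std::regex_search(tag, urlMatch, imgUrl)) {
            urls.push_back(urlMatch[1].str());
        }
    }
    return urls;
}
```

You would call it as extractImageUrls(rawData) and load the first entry. The returned strings are taken as-is, so if the page percent-encodes the URL you would still need to decode it.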

Hope that helps

If you want to use an HTML parser, my https://github.com/bakercp/ofxGumbo addon will pull out image links or other HTML tags, etc. Check out the examples.

@carolinebuttet @TobiasZ @fkkcloud

I’ve made a mistake :frowning:

I didn’t realize that the HTML returned when you do a Google image search in a browser is different from the HTML returned when using ofLoadUrl in openFrameworks. My previous comment was based on looking at the HTML using the dev console in Chrome.

The example will be fixed in future releases of openFrameworks, BUT if you are still looking to get this regex example working right now, all you need to do is change line 48 to be:

RegularExpression regEx("<table class=\"images_table\" style=\"table-layout:fixed\" [^>]+>(.*?)</table>");
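
If you want to check a pattern like this against a saved copy of the HTML without hitting the network, a rough standalone sketch with std::regex could look like the following. This is not the code from the example: it swaps Poco’s RegularExpression for std::regex, uses [\s\S]*? instead of .*? so the capture can span line breaks, and the function name extractImagesTable is invented for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return the inner HTML of the first images_table,
// or an empty string when the page contains no such table.
std::string extractImagesTable(const std::string& html) {
    static const std::regex tablePattern(
        R"rx(<table class="images_table" style="table-layout:fixed" [^>]+>([\s\S]*?)</table>)rx");
    std::smatch match;
    return std::regex_search(html, match, tablePattern) ? match[1].str() : std::string();
}
```

An empty result on real data is a hint that the page layout no longer matches the pattern, which is easier to spot than a silent zero from the match call.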

Oh very nice, thank you!!

Thank you @mikewesthad !

Hi,
ofxJSON does not correctly process the JSON returned from CSE. picojson does, on the other hand, so I moved on to using that.

Once I am finished, I will post the whole Google (CSE) image search project here.
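
For anyone following along, a CSE image query is an HTTP GET against the Custom Search JSON API with the API key, the search engine ID, and the query passed as URL parameters. Below is a minimal sketch of building that URL. The helper names urlEncode and buildImageSearchUrl are made up for illustration and are not taken from any addon; the key and engine ID are whatever you get when you register with Google.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Percent-encode everything except the RFC 3986 unreserved characters.
std::string urlEncode(const std::string& text) {
    std::string out;
    for (unsigned char c : text) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%' + two hex digits + terminator
            std::snprintf(buf, sizeof(buf), "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Hypothetical helper: build a Custom Search request limited to image results.
// apiKey and engineId are placeholders for your own credentials.
std::string buildImageSearchUrl(const std::string& apiKey,
                                const std::string& engineId,
                                const std::string& query) {
    return "https://www.googleapis.com/customsearch/v1?key=" + urlEncode(apiKey)
         + "&cx=" + urlEncode(engineId)
         + "&q=" + urlEncode(query)
         + "&searchType=image";
}
```

The resulting string can be passed to ofLoadUrl, and the JSON that comes back then needs to be parsed (with picojson or similar) to get at the image links.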

Hi,
I am developing the addon (an openFrameworks C++ addon for Google Custom Search (CSE)) in my spare time. Hopefully I can commit soon. Meanwhile, the project is located here:

https://github.com/petafemto/ofGoogleCSE

Sample result for image search:

@mikewesthad

Your fix worked until 2 days ago, but it seems that Google changed the website layout and it doesn’t work anymore… :open_mouth:

This is the html that is returned:

I have no experience with “regular expression” and have no idea how to change it to make it work again :confused:
Could you help me again? :wink:

You are getting a 302 response, which means the webpage is redirecting. Did you try using the redirected link (read the HTML in your screengrab)? It looks like it is redirecting you to the Hong Kong version of Google.

If you need an HTTP client that automatically follows redirects, check out ofxHTTP.
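
If you would rather follow the redirect by hand, here is a rough sketch that pulls the target link out of the body of a redirect page. It assumes the body contains a single link of the form <A HREF="...">here</A>, which is what such pages usually look like but is not guaranteed. The function name extractRedirectTarget is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return the first href found in a redirect page body,
// or an empty string if there is none. Matching is case-insensitive because
// these pages often use upper-case tags. Any "&amp;" in the link is left as-is.
std::string extractRedirectTarget(const std::string& body) {
    static const std::regex hrefPattern(R"rx(<a\s+href="([^"]+)")rx", std::regex::icase);
    std::smatch match;
    if (std::regex_search(body, match, hrefPattern)) {
        return match[1].str();
    }
    return std::string();
}
```

If the result is not empty, you can pass it to ofLoadUrl again and run your regex on the second response.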

Ok, thanks! I’ll check that!