image recognition for fixed numbers
I've never played this game, but if it has a system similar to my own real bank account, then basically the numbers/letters of a PIN plus a few more random characters are arranged in boxes that you need to click in the correct order. The layout resembles a typewriter keyboard.
Anyway, the images inside those boxes are always the same, but the order changes each time you need to authenticate.
So, I assume that you always know the button locations (because they are fixed) and that the number of extra random characters is always the same as well.
That means you can capture a rectangular area for each button, and you want to know what character is inside it. So you need a reference copy of each of your PIN character images to do the comparison. If your program always captures the button locations exactly the same way, then you should be able to do an exact, pixel-for-pixel image comparison.
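Here is a minimal sketch of that exact-comparison idea in Python. Everything in it is an assumption for illustration: the images are modelled as tuples of pixel rows instead of real screen captures, and the names identify_button, click_order and references are made up, not taken from any library.

```python
# Hypothetical sketch: each image is a tuple of pixel rows.
# In a real program these would come from a screen-capture library.

def identify_button(button_pixels, references):
    """Return the character whose reference image equals the button exactly."""
    for char, ref_pixels in references.items():
        if ref_pixels == button_pixels:
            return char
    return None  # no reference matched (e.g. one of the extra characters)


def click_order(pin, captured_buttons, references):
    """Return the button indexes to click, one per PIN character."""
    shown = {identify_button(px, references): i
             for i, px in enumerate(captured_buttons)}
    # Raises KeyError if a PIN character is not on screen.
    return [shown[c] for c in pin]


# Made-up 3x3 reference images for the characters "1", "2" and "3".
references = {
    "1": ((0, 1, 0), (1, 1, 0), (0, 1, 0)),
    "2": ((1, 1, 0), (0, 1, 0), (0, 1, 1)),
    "3": ((1, 1, 1), (0, 1, 1), (1, 1, 1)),
}

# Buttons as captured this time, in on-screen order: "2", "3", "1".
captured = [references["2"], references["3"], references["1"]]

print(click_order("123", captured, references))  # -> [2, 0, 1]
```

This only works if the capture is pixel-identical to the reference. A single differing pixel makes the comparison fail.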
Just so you know, if you had a 'noisy' image (an imperfect copy), then you would need to do a 2-D cross-correlation (the same operation as convolution, just without flipping the reference) between each reference image and the button image, and pick the reference image that gives the highest output. You should refer to a signal processing book to understand this operation, but basically it means multiplying the reference image with the other image, pixel by pixel, at each shift and summing the products. It's usually best to normalize the result so that a bright image doesn't score high against everything. I don't think you'll need this, unless the image scale changes. Then you would re-scale one of the images to match the size of the other (which leads to a 'noisy' image), and you would need the correlation to find a match.
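The matching described above is usually done as normalized cross-correlation. Below is a rough pure-Python sketch of it, assuming grayscale images stored as lists of pixel rows. The function names (ncc, best_score, best_match) and the tiny test images are made up for illustration. A real program would more likely use an image library for speed.

```python
import math


def ncc(a, b):
    """Normalized correlation coefficient of two equal-size 2-D pixel lists."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # flat images carry no information


def best_score(image, ref):
    """Slide ref over image and return the highest score at any shift."""
    rh, rw = len(ref), len(ref[0])
    ih, iw = len(image), len(image[0])
    best = -1.0
    for top in range(ih - rh + 1):
        for left in range(iw - rw + 1):
            patch = [row[left:left + rw] for row in image[top:top + rh]]
            best = max(best, ncc(patch, ref))
    return best


def best_match(image, references):
    """Return the character whose reference correlates best with image."""
    return max(references, key=lambda ch: best_score(image, references[ch]))


# Made-up 3x3 references: a vertical bar for "1" and a hook shape for "7".
refs = {
    "1": [[0, 9, 0], [0, 9, 0], [0, 9, 0]],
    "7": [[9, 9, 9], [0, 0, 9], [0, 0, 9]],
}

# A 4x4 capture holding a shifted, slightly noisy copy of the "7".
noisy = [
    [1, 0, 1, 0],
    [0, 8, 9, 8],
    [1, 1, 0, 9],
    [0, 0, 1, 8],
]

print(best_match(noisy, refs))  # -> 7
```

Subtracting the mean and dividing by the norms is what keeps the score between -1 and 1, so a threshold (say, reject anything under 0.8) can be used to detect "no match". That threshold is a guess you would have to tune.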
If your buttons aren't fixed images, but look like those weird stretched characters (CAPTCHAs) that you need to type to submit a comment on blogs, then the image recognition becomes considerably more complicated. Not worth your time.