Method

These files were obtained using reconnecting bots.

I have observed that when connecting from an IP address with a poor reputation (e.g., a VPN or proxy IP), users are shown a captcha when they log in. The captcha looks something like this:

A test image of a captcha

These captchas are a bit funny, because a bot can solve them more easily than a human can. However, I noticed that other AI-based solutions, such as Tesseract (OCR) and even paid services such as NopeCHA, consistently return an incorrect or invalid result. Therefore, I set out to create a captcha-solving AI model for them!

Data collection

Data was collected using Mineflayer bots connected through NordVPN, so that Minehut would consistently serve captchas to the client. Mineflayer doesn't provide a native interface for handling maps, so I had to get a bit creative. Minecraft maps use a limited set of colors, so I found a premade lookup table from the colors the server sends to hexadecimal colors that can be written to an image. 248 colors were mapped.
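The color lookup described above can be sketched with NumPy integer-array indexing. This is only a minimal sketch, not the bot's actual code: `PALETTE` is a hypothetical four-entry stand-in for the real 248-color table, `map_ids_to_rgb` is an illustrative name, and the sketch assumes each map arrives as a 128x128 grid of integer color IDs.

```python
import numpy as np

# Hypothetical palette: row i holds the RGB value for map color ID i.
# The real table has 248 entries; these four are placeholders.
PALETTE = np.array(
    [
        [0, 0, 0],        # ID 0 (placeholder)
        [127, 178, 56],   # ID 1 (placeholder)
        [255, 255, 255],  # ID 2 (placeholder)
        [64, 64, 255],    # ID 3 (placeholder)
    ],
    dtype=np.uint8,
)


def map_ids_to_rgb(color_ids, palette=PALETTE):
    """Turn a 2-D grid of color IDs into an (H, W, 3) uint8 RGB array."""
    ids = np.asarray(color_ids)
    if ids.min() < 0 or ids.max() >= len(palette):
        raise ValueError("color ID outside the palette")
    # Indexing the palette with an integer array looks up every pixel at once.
    return palette[ids]


# Example: a 128x128 map filled with ID 2, with one pixel set to ID 1.
ids = np.full((128, 128), 2, dtype=np.int64)
ids[0, 0] = 1
rgb = map_ids_to_rgb(ids)
```

The resulting array could then be handed to an image library for saving; with the full table in place of the placeholder, the same lookup would cover every color.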

The maps are displayed in item frames, so the map images had to be combined in two passes, first vertically, then horizontally. The resulting images were assigned UUIDs.
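The two-pass combination can be sketched as follows. This is an illustration under assumptions, not the original bot code: the grid size (2 rows by 3 columns here), the 128x128 tile size, and the names `stitch_tiles` and `captcha_filename` are all hypothetical.

```python
import uuid

import numpy as np


def stitch_tiles(columns):
    """Combine map tiles into one image.

    `columns` is a list of columns; each column is a list of
    (H, W, 3) tiles ordered top to bottom.
    """
    # First pass: stack each column's tiles vertically.
    stacked = [np.vstack(tiles) for tiles in columns]
    # Second pass: join the stacked columns left to right.
    return np.hstack(stacked)


def captcha_filename():
    """Return a unique file name for a stitched captcha (hypothetical scheme)."""
    return f"{uuid.uuid4()}.png"


# Example: 3 columns of 2 tiles each, every tile 128x128 pixels.
tile = np.zeros((128, 128, 3), dtype=np.uint8)
columns = [[tile, tile] for _ in range(3)]
image = stitch_tiles(columns)
name = captcha_filename()
```

Naming files by random UUID avoids collisions when several bots save captchas at the same time.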

Classification

Technically, I could have classified these files manually, but I instead paid a few dollars for human classifiers at 2Captcha. The bot downloaded 500 images, of which 438 were usable. Human classifiers identified and labeled 410 of those (some captchas are unsolvable because the letters are too distorted).

Neural Network

TensorFlow (Keras) was used to train the neural network, which is a Convolutional Neural Network (CNN). ReLU was chosen as the activation function, because with sigmoid the model identified the characters correctly only about half of the time.

Images and strings (labels) were encoded using NumPy.

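import numpy as np
from PIL import Image

def convert_text_to_array(text, char_to_idx, max_len):
    # Map each character to its integer index, then right-pad with zeros to max_len.
    text_array = [char_to_idx[char] for char in text]
    padded_text_array = np.pad(text_array, (0, max_len - len(text_array)), 'constant')
    return padded_text_array

def convert_image_to_array(image_path, width, height):
    # Load the image, resize it to the model's input size, and scale pixels to [0, 1].
    img = Image.open(image_path).resize((width, height))
    img_array = np.asarray(img) / 255.0
    return img_array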

Prediction

Prediction is simple: the model predicts what the characters are given an image that is passed in, and that is converted back to a string.
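Converting a prediction back to a string can be sketched as below. This is a hedged illustration, not the model's actual decoding code: it assumes the model outputs one row of class scores per character position, that `CHARSET` (a made-up character set) matches the training labels, and that index 0 is reserved for padding. `decode_prediction` is a hypothetical helper name.

```python
import numpy as np

# Assumed character set; index 0 is reserved for padding so that it
# cannot be confused with a real character.
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
idx_to_char = {i + 1: c for i, c in enumerate(CHARSET)}


def decode_prediction(probs):
    """Convert a (max_len, num_classes) array of class scores to a string."""
    # Pick the highest-scoring class at each character position.
    indices = np.argmax(probs, axis=-1)
    # Skip padding positions (index 0) and join the remaining characters.
    return "".join(idx_to_char[i] for i in indices.tolist() if i != 0)


# Example: fake scores for "ab" followed by two padding positions.
num_classes = len(CHARSET) + 1
probs = np.zeros((4, num_classes))
probs[0, 1] = 1.0  # 'a'
probs[1, 2] = 1.0  # 'b'
probs[2, 0] = 1.0  # padding
probs[3, 0] = 1.0  # padding
text = decode_prediction(probs)
```

Reserving index 0 for padding is a design choice in this sketch; if the real `char_to_idx` starts at 0, padded positions would be indistinguishable from the first character.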

Downloading

Models are available for download in the files folder.

Running

Java and JavaScript (Mineflayer) clients are coming soon, as well as a C# API (for use in OQMinebot and MinecraftConsoleClient).

For now, it runs well in Python/Google Colab. I haven't yet tested it on a CPU-only machine, but it's light enough that it should work well on one with a few fixes.

Why?

This project was made for fun and to showcase how easily captchas can be broken. Timing myself, I solve a captcha in about 8 seconds; the AI solves one in 22-25 ms. Please don't use image captchas to secure anything. They've been outdated for years now.

I provide no guarantee of support, but you can hire me for projects at Incomprehensibl#9989, @JIBSIL (Telegram) or tech101file@gmail.com, whichever you prefer. If something breaks, I will do my best to fix it; just open an issue on GitHub or here.