Tuesday, June 14, 2016

Building a Poker Bot: String and Number Recognition

This is the second part of the Building a Poker Bot series, where I describe my experience developing bot software to play in online poker rooms. I'm building the bot with the .NET Framework and the F# language, which makes the task relatively easy and very enjoyable. Here is the first part: Building a Poker Bot: Card Recognition

Why string recognition

Reading cards and other fixed images was the first step. The bot should also be able to read different text-based information from the screen, e.g.

Current blind levels
Current pot size
The size of bets made by each player
Player names
Stack sizes
Chat messages (for advanced scenarios)

We need this vital information to make proper decisions, so let's look at how to parse the textual data.

New challenges

String recognition has some specific difficulties when compared to fixed images like cards:

The size of a string is not predefined. Obviously, the longer the string, the more space it takes on the screen
The position of a string is not fixed either. Some strings are aligned to the center, others may diverge based on other variable parts like stakes or blinds
Different strings might be rendered in different font sizes

Here is what needs to be done to overcome these complications:

Pick the layout which makes your life easier
Adjust fonts and positions if possible
Make sure that all important strings are always visible and do not overlap other information
For each string, define a region that contains it in 100% of cases. The background of this region should be filled more or less evenly with a color that contrasts with the font color.

String recognition steps

We start with a screenshot of a poker table again:

We know our fixed regions where our labels are located, so we take those regions for processing:

For each region we trim away the blank margins around the text (i.e. left, top, right and bottom padding):

We find dark lines between bright symbols and we consider them as gaps between characters:

The final step is to compare each symbol to the known patterns and find the best match (in the case of my layout, the match for symbols is always 100% perfect). Let's look at how these steps are implemented.

Removing padding around the text

Because the padding is removed from all 4 sides of the region, I decided to use the Array2D data type so that I can iterate over both columns and rows. The whole algorithm operates on black or white points defined as a helper type:

type BW = B | W

So the removePadding function has the type BW[,] -> BW[,] and looks like this:

let removePadding pixels =
  // true when a column or row contains at least one white point
  let hasWhite s = Seq.exists ((=) W) s
  let maxWidth = Array2D.length1 pixels - 1
  let maxHeight = Array2D.length2 pixels - 1
  // first and last columns that contain white points
  let firstX = [0..maxWidth] 
    |> Seq.tryFindIndex (fun x -> hasWhite pixels.[x, 0..maxHeight])
  let lastX = [0..maxWidth] 
    |> Seq.tryFindIndexBack (fun x -> hasWhite pixels.[x, 0..maxHeight])
  // first and last rows that contain white points
  let firstY = [0..maxHeight] 
    |> Seq.tryFindIndex (fun y -> hasWhite pixels.[0..maxWidth, y])
  let lastY = [0..maxHeight] 
    |> Seq.tryFindIndexBack (fun y -> hasWhite pixels.[0..maxWidth, y])

  match (firstX, lastX, firstY, lastY) with
  | (Some fx, Some lx, Some fy, Some ly) -> pixels.[fx..lx, fy..ly]
  | _ -> Array2D.init 0 0 (fun _ _ -> B)

The first part finds the indices of the first and last columns and rows that contain white points. Then, if white points are found, the second part returns a sub-array based on those indices; otherwise an empty array is returned.
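To see the trimming in action, here is a small sketch that runs the function on a hand-made region. The function body repeats the logic above (with the helper named hasWhite) so that the snippet runs on its own; the 4x3 test region is invented for illustration.

```fsharp
type BW = B | W

// Same logic as removePadding above, repeated to keep the sketch self-contained.
let removePadding (pixels : BW[,]) =
  let hasWhite s = Seq.exists ((=) W) s
  let maxWidth = Array2D.length1 pixels - 1
  let maxHeight = Array2D.length2 pixels - 1
  let firstX =
    [0..maxWidth] |> Seq.tryFindIndex (fun x -> hasWhite pixels.[x, 0..maxHeight])
  let lastX =
    [0..maxWidth] |> Seq.tryFindIndexBack (fun x -> hasWhite pixels.[x, 0..maxHeight])
  let firstY =
    [0..maxHeight] |> Seq.tryFindIndex (fun y -> hasWhite pixels.[0..maxWidth, y])
  let lastY =
    [0..maxHeight] |> Seq.tryFindIndexBack (fun y -> hasWhite pixels.[0..maxWidth, y])
  match (firstX, lastX, firstY, lastY) with
  | (Some fx, Some lx, Some fy, Some ly) -> pixels.[fx..lx, fy..ly]
  | _ -> Array2D.init 0 0 (fun _ _ -> B)

// Invented 4x3 region: white points only at (1,1) and (2,1).
let region =
  Array2D.init 4 3 (fun x y -> if (x = 1 || x = 2) && y = 1 then W else B)

let trimmed = removePadding region
// trimmed is 2 columns wide and 1 row high

// A fully black region has nothing to keep, so the result is empty.
let empty = removePadding (Array2D.create 4 3 B)
// empty is a 0x0 array
```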

Split the text into characters

First, we convert our 2D array into the list of lists, where each item in the top-level list represents a single column of pixels:

let pixelColumns =
  [0..Array2D.length1 pixels - 1] 
  |> Seq.map (fun x -> pixels.[x, 0..Array2D.length2 pixels - 1] |> List.ofArray)

Then we can fold this list of columns into the symbols, where each symbol itself is the list of columns:

let splitIntoSymbols (e : BW list) (state: BW list list list) = 
  match state with
  | cur::rest ->
      if isSeparator e then
        match cur with
        | _::_ -> []::state // add new list
        | _ -> state        // skip if we already have empty item
      else (e::cur)::rest   // add e to current list
  | _ -> [[e]]

Seq.foldBack splitIntoSymbols pixelColumns []

The type of state is a bit of a brain teaser. I guess it could be improved by introducing an intermediate type with a descriptive name, but I decided to leave that part for now. Read it as a list of symbols, which are lists of columns, which are lists of pixels.
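The isSeparator helper is not shown in this post. A minimal sketch, assuming that a separator is simply a column without any white points, together with a tiny worked example of the fold (the sample columns are made up):

```fsharp
type BW = B | W

// Assumption: a gap between characters is a column with no white points.
let isSeparator (column : BW list) = List.forall ((=) B) column

// Same fold step as above, repeated so the sketch runs on its own.
let splitIntoSymbols (e : BW list) (state : BW list list list) =
  match state with
  | cur::rest ->
      if isSeparator e then
        match cur with
        | _::_ -> []::state // start a new symbol
        | _ -> state        // consecutive separators are collapsed
      else (e::cur)::rest   // prepend the column to the current symbol
  | _ -> [[e]]

// Two symbols (2 columns and 1 column wide) with a 2-column gap between them.
let columns =
  [ [W; B]; [W; W]; [B; B]; [B; B]; [B; W] ]

let symbols = List.foldBack splitIntoSymbols columns []
// symbols = [ [[W; B]; [W; W]]; [[B; W]] ]
```

Because foldBack walks the columns from right to left and each column is prepended, both the symbols and the columns inside them end up in their original left-to-right order.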

Match the symbols vs the known patterns

This part was already described in my first article. Basically we compare the list of black or white points to the patterns of the known symbols:

let getChar patterns bws =
  let samePatterns h p =
    Seq.zip h p
    |> Seq.forall (fun (v1, v2) -> v1 = v2)
  let matchingPattern = 
    patterns 
      |> Array.filter (fun p -> List.length p.Pattern = List.length bws)
      |> Array.filter (fun p -> samePatterns bws p.Pattern)
      |> Array.tryHead
  defaultArg (Option.map (fun p -> p.Char) matchingPattern) '?'
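
The pattern type itself comes from the first part and is not repeated here. As a rough sketch, assuming a record with Char and Pattern fields (the CharPattern name and the toy 2-pixel-high patterns below are hypothetical), the lookup behaves like this:

```fsharp
type BW = B | W

// Hypothetical record shape; only the Char and Pattern fields are used above.
type CharPattern = { Char : char; Pattern : BW list list }

// Same lookup as above, with type annotations added for the sketch.
let getChar (patterns : CharPattern[]) (bws : BW list list) =
  let samePatterns h p =
    Seq.zip h p
    |> Seq.forall (fun (v1, v2) -> v1 = v2)
  let matchingPattern =
    patterns
    |> Array.filter (fun p -> List.length p.Pattern = List.length bws)
    |> Array.filter (fun p -> samePatterns bws p.Pattern)
    |> Array.tryHead
  defaultArg (Option.map (fun p -> p.Char) matchingPattern) '?'

// Toy patterns, invented for illustration only.
let toyPatterns =
  [| { Char = '|'; Pattern = [ [W; W] ] }
     { Char = '_'; Pattern = [ [B; W]; [B; W] ] } |]

let bar = getChar toyPatterns [ [W; W] ]                // '|'
let underscore = getChar toyPatterns [ [B; W]; [B; W] ] // '_'
let unknown = getChar toyPatterns [ [W; B] ]            // '?', nothing matches
```

The width check comes first because Seq.zip stops at the shorter sequence, so without it a narrow symbol could match the prefix of a wider pattern.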

Putting it all together

The recognizeString function accepts a function to match symbols and a function to get pixels, together with the width and height of the region:

recognizeString: (BW list list -> char) -> (int -> int -> color) -> int -> int -> string

It builds an array of pixels, removes padding and folds with recognition.

let recognizeString matchSymbol getPixel width height =

  let pixels = 
    Array2D.init width height (fun x y -> isWhite (getPixel x y))
    |> removePadding

  let pixelColumns =
    [0..Array2D.length1 pixels - 1] 
    |> Seq.map (fun x -> pixels.[x, 0..Array2D.length2 pixels - 1] |> List.ofArray)      

  Seq.foldBack splitIntoSymbols pixelColumns []
  |> List.map matchSymbol
  |> Array.ofSeq
  |> String.Concat
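
The isWhite helper used above is not defined in this post. One possible sketch, assuming that pixels arrive as RGB bytes and that "white" means every channel is above a mid-range threshold; the Rgb record, the threshold of 127, and the sample getPixel are all assumptions made for illustration:

```fsharp
type BW = B | W

// Hypothetical stand-in for whatever color type getPixel returns.
type Rgb = { Red : byte; Green : byte; Blue : byte }

// Assumption: a point counts as white when all channels are bright.
let isWhite (c : Rgb) =
  if c.Red > 127uy && c.Green > 127uy && c.Blue > 127uy then W else B

// Sample pixel source: a single bright column on a dark green background.
let white = { Red = 255uy; Green = 255uy; Blue = 255uy }
let felt = { Red = 20uy; Green = 60uy; Blue = 30uy }
let getPixel x (_ : int) = if x = 1 then white else felt

let pixels = Array2D.init 3 2 (fun x y -> isWhite (getPixel x y))
// pixels.[1, 0] = W and pixels.[0, 0] = B
```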

Then we use it with specific recognition patterns, e.g. known digits in the case of number recognition:

let recognizeNumber x =
  recognizeString (getChar numberPatterns) x

A way to produce these patterns is discussed in the previous part.

Conclusion

String recognition takes a few more steps than the recognition of fixed objects. Nevertheless, it's pretty straightforward to implement once we split it into small and well-understood conversion steps. The full code for string recognition can be found in my github repo.

http://mikhail.io/2016/02/building-a-poker-bot-string-recognition/


Security breaches



Title: twin hantu's login trojan horse
Machine: burro.monkeybrains.net 
OS: Linux 2.1?
URLs: 1. http://www.sans.org/y2k/050900-1500.htm
      2. http://staff.washington.edu/dittrich/misc/trinoo.analysis
Summary (notes taken during compromise analysis):

The machine is compromised (portmap?? -- still unsure how the initial breach was made).
A 'twin' user is created with UID=0 and HOME=/.
A 'hantu' user is created and then erased
The 'twin' line is edited out of the /etc/passwd file with pico.
The /etc/shadow retains the 'twin' user.
The target machine's 'login' is replaced with a trojan horse.
This Trojan horse allows root access for incoming telnets with a specific term setting. This vt number can be found by doing a 'strings login | grep vt'.
A UDP controlled server named 'ns' is installed (a ps -aux reveals a ./ns). When started up, it sends a *HELLO* packet to a client (its IP is available from a 'strings ns'). The ns on burro was installed in /daemon/ns. This is how I was alerted to burro's infection: burro was ping flooding other machines on the internet with this 'ns' client. (Please see URL #2 above.)
The attacker leaves behind a .bash_history file which reveals several more tidbits:
1) The ftp host which houses the 'bj.c' which is compiled to make the trojan login.
2) Other machines the user leapfrogs to from your machine. All you have to do is set term=vt???? where ???? indicates a number from 1000-9999 and you too can access other compromised machines.
3) Most commands are issued through a client side script. 'twin' doesn't really know Unix.
4) Of course, this .bash_history file could be a plant, but I'm leaning toward a not-too-bright user scenario.
More information is found in other system log files (e.g. originating IPs for telnets).

Time to reformat that machine with FreeBSD!!!
 


Another breakin this week... at a place I contracted at for a few hours.  
They too were running Linux.  I patched up all the messed up binaries 
with new rpm... More info



Here are the people (and bots) who have looked at this page:
gunzip -c /www/logs/archive/access-www.monkeybrains.net.gz | grep ' /security' | awk '{print $1}' | sort -u | nslookup | grep Name:
*** lala.monkeybrains.net can't find 208.37.12.165: Non-existent host/domain
*** lala.monkeybrains.net can't find 208.48.124.4: Server failed
*** lala.monkeybrains.net can't find 212.150.51.90: Non-existent host/domain
*** lala.monkeybrains.net can't find 216.34.109.191: Non-existent host/domain
*** lala.monkeybrains.net can't find 216.34.109.192: Non-existent host/domain
Name:    ras-c5800-1-49-73.dialup.wisc.edu
Name:    kremlin.cs.uidaho.edu
Name:    mail.skynet.gr
Name:    ss06.ny.us.ibm.com
Name:    ss11.ny.us.ibm.com
Name:    AKCF1.xtra.co.nz
Name:    aspseek.swusa.com
Name:    208.184.110.33.svwh.net
Name:    marvin.northernlight.com
Name:    lb1.antarcti.ca
Name:    j6000.inktomi.com
Name:    cr032r01.bos2.fastsearch.net
Name:    router-sj.atomz.com
Name:    gw03.webtop.com
Name:    gw04.webtop.com
Name:    www.britton-gw-uk.proteusweb.com
Name:    adsl-216-103-213-34.dsl.snfc21.pacbell.net
Name:    dhcp-197.sf.bmarts.com
Name:    www.ip3000.com
Name:    www.ip3000.com
Name:    d83b38fc.dsl.flashcom.net
Name:    adsl-63-203-32-98.dsl.snfc21.pacbell.net
Name:    adsl-63-203-75-141.dsl.snfc21.pacbell.net
Name:    crawler3.googlebot.com
Name:    crawler1.googlebot.com
Name:    crawler2.googlebot.com
Name:    router-sc.atomz.com


This page was created to keep track of security breaches on the 
MonkeyBrains network.

(I hope rk is friendly hehehe)

https://www.monkeybrains.net/security/

BLACK MASK