Google App Engine and socket.inet_ntoa

Yesterday I started tackling a new problem in Python: how to parse a pcap file. The idea is to host something on Google App Engine and have it do the work. I started working with the Python library dpkt to open the pcap file. But pcap files store the source and destination IP addresses in packed binary form, four raw bytes per address. That was a format I had never heard of until now, so I had no idea what to do with it. It turns out that socket.inet_ntoa will convert it to the dotted-quad string you are used to seeing.

Unfortunately, Google App Engine doesn’t provide the socket library. So I figured there had to be a way to build this functionality on my own. Let me tell you: considering how much (erm, how little) I know about Python, this was no easy task. First I had to figure out what that library function does.

Well, the socket library in Python is simply a wrapper around the C socket library on the OS. Finding that source code ended up being fairly easy, but it didn’t really help me much. It wasn’t until I stumbled onto this article on StackOverflow that I started getting somewhere. But you may notice that it is a Java example, and I was looking for Python.

It was pretty trivial to translate that code from Java to Python:
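A sketch of what that kind of translation looks like, assuming the Java answer used the usual shift-and-mask approach on a 32-bit integer. The function name `int_to_ip` is hypothetical and chosen only for illustration:

```python
def int_to_ip(number):
    """Convert a 32-bit integer to dotted-quad notation using shifts and masks."""
    # Pull out each byte, most significant first, and join them with dots.
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(int_to_ip(0x0A010A01))  # 10.1.10.1
```

Note that this only works when `number` really is an integer; passing it a string is what triggers a TypeError on the `>>` operator.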

But that wasn’t getting me what I needed. When I tried to use it, I got: TypeError: unsupported operand type(s) for >>: 'str' and 'int'. It wasn’t making much sense to me. So I tried a bunch of things to try to understand what this number was. After a while I saw that the number I was getting from Wireshark via dpkt was '\x01\n\x01'. Again, not making much sense.

It wasn’t until I went back to Wireshark that I started cluing in on the problem. I looked at the packet and found the section that showed the source IP address. Clicking on that IP address highlighted the hex representation below: 0a010a01. From the classes I teach on parsing text files using Datagrabber, which is part of our Alchemy product, I know that the hex representation of a newline is 0a. That explains the 0a, or \n, in the middle of the src address above.
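To see why the newline shows up, here is a small sketch that turns the hex string from Wireshark back into raw bytes using the standard binascii module. The variable name `packed` is just for illustration:

```python
import binascii

# The four bytes Wireshark highlighted, written as a hex string.
packed = binascii.unhexlify("0a010a01")

# Printing the repr shows byte 0x0a as the newline escape \n.
print(repr(packed))

# Looking at the bytes as integers gives the four octets of the address.
print(list(bytearray(packed)))  # [10, 1, 10, 1]
```

So the odd-looking string is simply the IP address with each octet stored as one raw byte.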

Now that I had a better understanding of what dpkt was spitting out for the IP address, I started looking for ways to convert that into a more usable format. That’s when I stumbled onto this StackOverflow discussion on converting hex strings to IP addresses. struct.unpack('!I', number) was the key: it reads the four packed bytes as a single unsigned 32-bit integer in network byte order.

The resulting function to replace socket.inet_ntoa is listed here:
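A minimal sketch of such a replacement, assuming the address arrives as a 4-byte packed string the way dpkt hands it over. It combines struct.unpack('!I', ...) with the shift-and-mask step. The length check is an addition for illustration, and this is not necessarily identical to the function originally posted:

```python
import struct


def inet_ntoa(packed):
    """Convert a 4-byte packed IPv4 address to a dotted-quad string."""
    if len(packed) != 4:
        raise ValueError("packed IPv4 address must be exactly 4 bytes")
    # '!I' means network byte order (big-endian), unsigned 32-bit integer.
    (number,) = struct.unpack("!I", packed)
    # Extract each octet, most significant first, and join with dots.
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(inet_ntoa(b"\n\x01\n\x01"))  # 10.1.10.1
```

Because it relies only on the struct module, it should run in environments where the socket module is unavailable.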
