Friday, February 27, 2015

Unsupervised learning with distances and medoids

In this post I concern myself with unsupervised machine learning. I use my executable distance program from my “What is the distance from notepad.exe to AcroRd32.exe?” post to examine executable clustering using “Partitioning Around Medoids”.
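
For readers who have not met the technique, here is a minimal sketch of the swap phase of Partitioning Around Medoids on a precomputed distance matrix. It is not the code from the post: the function names, the naive choice of starting medoids (real PAM uses a BUILD step) and the toy data are all assumptions made for illustration.

```python
def total_cost(dist, medoids):
    """Sum of each item's distance to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns (medoids, labels), where labels[i] is the index of the medoid
    that item i is assigned to.
    """
    n = len(dist)
    medoids = list(range(k))  # assumption: naive start instead of PAM's BUILD step
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid item.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:  # keep any swap that lowers the total cost
                    best, medoids, improved = cost, trial, True
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return medoids, labels


# Toy usage: six "executables" whose pairwise distances are made up.
toy_points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in toy_points] for a in toy_points]
toy_medoids, toy_labels = pam(toy_dist, 2)
```

Because the medoids are always actual items, the method only needs pairwise distances, which is why it fits a distance measure between executables better than something like k-means, which needs coordinates to average.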

The text is here:
The source code and data are here:

Wednesday, February 25, 2015

Primitive unpacking using emulation

In this post I examine unpacking an executable using a CSIM emulator. First I spend a few words on emulation techniques in general, and then I develop a CSIM made to unpack one specific executable as a potential use case for emulators. The emulator that I made is very primitive and very much a work in progress. After all, I wrote it in less than 24 work hours. I will probably return to improve the emulator at a later point for other purposes.
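
To give a feel for the general idea of unpacking by emulation, here is a toy sketch. It is not the emulator from the post and it does not emulate x86: the instruction set, the opcodes, the stub layout and all names are hypothetical. It shows only the common heuristic of running the unpacking stub until execution reaches memory that the stub itself has written.

```python
# Hypothetical toy instruction set (assumption, invented for this sketch):
#   0x01 imm   r0 = imm            0x02 imm   r1 = imm
#   0x03 key   mem[r0] ^= key      0x04       r0 += 1; r1 -= 1
#   0x05 addr  jump if r1 != 0     0x06 addr  jump
#   0xFF       halt

def emulate_until_oep(image, max_steps=10_000):
    """Run the toy stub; stop when pc reaches a byte the stub has written.

    Returns (entry_point, memory). entry_point is None if no written
    memory was executed before a halt or the step limit.
    """
    mem = bytearray(image)
    pc, r0, r1 = 0, 0, 0
    written = set()  # addresses modified by the emulated code
    for _ in range(max_steps):
        if pc in written:
            return pc, bytes(mem)  # executing unpacked code: stop here
        op = mem[pc]
        if op == 0x01:
            r0, pc = mem[pc + 1], pc + 2
        elif op == 0x02:
            r1, pc = mem[pc + 1], pc + 2
        elif op == 0x03:
            mem[r0] ^= mem[pc + 1]
            written.add(r0)
            pc += 2
        elif op == 0x04:
            r0, r1, pc = r0 + 1, (r1 - 1) & 0xFF, pc + 1
        elif op == 0x05:
            pc = mem[pc + 1] if r1 != 0 else pc + 2
        elif op == 0x06:
            pc = mem[pc + 1]
        elif op == 0xFF:
            break
        else:
            raise ValueError(f"unknown opcode {op:#04x} at {pc}")
    return None, bytes(mem)


def build_packed_sample(payload, key=0x5A):
    """Build a 12-byte XOR-decoding stub followed by the encoded payload.

    Assumes 1 <= len(payload) <= 243 so every address fits in one byte.
    """
    stub = bytes([0x01, 12, 0x02, len(payload), 0x03, key,
                  0x04, 0x05, 4, 0x06, 12, 0xFF])
    return stub + bytes(b ^ key for b in payload)


sample = build_packed_sample(b"ABC")
entry_point, unpacked = emulate_until_oep(sample)
```

A real unpacker has to deal with a full instruction set, API calls and anti-emulation tricks, which is where most of the work lies.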

Text can be found here:
Source code here:

What is the distance from notepad.exe to AcroRd32.exe?

This post is about distance measures for executables. It first gives some general considerations and then develops a primitive distance measure. It then goes on to apply this measure to different versions of two distinct executables, and measures the distance to some random, unrelated executables to verify that it works properly.
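
As an illustration of what a primitive distance measure between two executables can look like, here is a minimal sketch using the Jaccard distance over byte n-grams. This is an assumption chosen for simplicity, not necessarily the measure developed in the post, and the function names are made up.

```python
def byte_ngrams(data, n=4):
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_distance(a, b, n=4):
    """Distance in [0, 1]: 0 means identical n-gram sets, 1 means no overlap."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    union = grams_a | grams_b
    if not union:  # both inputs shorter than n bytes
        return 0.0
    return 1.0 - len(grams_a & grams_b) / len(union)


# Usage on raw bytes; for real files, read them with open(path, "rb").read().
near = jaccard_distance(b"push ebp; mov ebp, esp", b"push ebp; mov ebp, esi")
far = jaccard_distance(b"push ebp; mov ebp, esp", b"completely unrelated data")
```

A measure like this is symmetric and gives zero for identical files, which are the minimum properties one wants before feeding distances into a clustering method.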
Text here:

Source code here:

Machine learning and malware with Bayes

This is my first real blog post in 15 years. The subject of this post is using machine learning techniques for malware. I started out way too ambitiously, which meant that this post is significantly below my own expectations. It was the direct reason why I invented the 24-hour limit. The text gives a short introduction to supervised machine learning in general and a short presentation of Bayes' theorem in the context of malware. It goes on to estimate Bayesian probabilities for identifying an executable packer. You may notice there is lots of code in the source code which isn't related to the text. It was sacrificed for time. Maybe I'll return to it some time in the future.
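
To make the Bayes part concrete, here is a minimal sketch of estimating the probability that an executable is packed given one observed binary feature. The feature, the sample data, the add-one smoothing and the function names are all assumptions for illustration and are not taken from the post's source code.

```python
def bayes_posterior(prior, p_feature_given_packed, p_feature_given_other):
    """P(packed | feature) by Bayes' theorem for a two-class problem."""
    evidence = (p_feature_given_packed * prior
                + p_feature_given_other * (1.0 - prior))
    if evidence == 0:
        raise ValueError("feature has zero probability under both classes")
    return p_feature_given_packed * prior / evidence


def posterior_from_samples(samples):
    """Estimate P(packed | feature) from labelled training samples.

    samples is a non-empty list of (has_feature, is_packed) booleans.
    Likelihoods use add-one (Laplace) smoothing so an unseen combination
    never gets probability exactly zero.
    """
    n_packed = sum(1 for _, packed in samples if packed)
    n_other = len(samples) - n_packed
    f_packed = sum(1 for feat, packed in samples if feat and packed)
    f_other = sum(1 for feat, packed in samples if feat and not packed)
    prior = n_packed / len(samples)
    like_packed = (f_packed + 1) / (n_packed + 2)
    like_other = (f_other + 1) / (n_other + 2)
    return bayes_posterior(prior, like_packed, like_other)


# Hypothetical training set: the feature could be something like
# "the entry point lies in the last section".
training = ([(True, True)] * 3 + [(False, True)]
            + [(True, False)] + [(False, False)] * 3)
p_packed = posterior_from_samples(training)
```

With several features, the usual next step is to multiply the per-feature likelihoods under an independence assumption, which gives a naive Bayes classifier.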

Text can be found here:
Source code here:

Monday, February 23, 2015

Welcome to my blog.

What Stones dream about 

Welcome to my blog. It's been 15 years since I last had a blog, so bear with me as I get started. Here I hope to write about:
  • low-level coding in general,
  • malware topics in general,
  • and the use of statistics and machine learning techniques for malware in particular.
I actually started out writing my first blog entry before starting the blog, and I fairly quickly figured out that my ambitions were far larger than what I could ever realize. Therefore I've imposed some limits on myself:
  • I shall not start writing a blog post that I estimate will take me more than 24 work hours to write.
  • Topics should interest me.
  • Focus on accessibility of my chosen topics rather than rigor. It's a blog, not science.
  • To the greatest extent possible, any results should be transparent and repeatable.
The first is simply a time constraint, because being a malware hobbyist is just one of my hobbies. I go climbing as much as I can, play football, and I'm one of those weirdos who own a table saw. On the other hand, I wish to spend my time here on topics which are slightly more advanced than explaining how to use the standard tools of the infosec trade. The compromise is that I'll mostly just skim the surface of a topic and that I'll not write tools, but "proof of concept" type code.

About me
I've been playing with malware and low-level coding since I first figured out how to add signatures to IBM antivirus back in 1992. As a hobby, I've written low-level software and reverse-engineered for DOS and Windows since 1996, including executable packers, unpackers, privilege elevation hacks, etc. I have a degree in economics, with a focus on econometrics (which is statistics for economics). Since 2000 I have worked professionally in software development. I’ve worked on everything from file system drivers for Win9x/NT and copy protection to video compression codecs and many other interesting things. I'm currently vice president of engineering at Protect Software GmbH.

Anders Fogh

Contact email: “anders_fogh” is the first part of my email.   The last part is “”