Malware Detection Based on Multiple PE Headers Identification and Optimization for Specific Types of Files

This paper follows up on our previous research, in which we carried out a basic experiment to find out whether malware can be detected by identifying multiple PE headers. The previous results showed that a considerable amount of malware attaches itself to another file. This paper summarizes and updates those results and extends them with an optimization method and with scans of other, specific types of data.


Introduction
The subject of our research is a malware detection method and the testing of other related methods. The aim is to take an unconventional approach to malware detection: to test existing methods and improve them, or to create a new method if it produces satisfactory results. The topic currently under examination is malware detection using metadata. Malware detection by metadata testing is not often covered, which is why we chose it for our research. This paper describes a detection method based on the occurrence of multiple PE headers.
From our previous research, it is apparent that there is a group of malware called parasitic viruses, which infect other files by attaching their code to the host file. They often attach their own headers as well, and if the malware is attached by one of the methods described below, the host file then contains more than one header.
We are able to detect multiple PE headers, and if the scanned file passes our optimization filter, the file is labelled as dangerous.
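The idea of counting PE headers can be illustrated with a minimal Python sketch. It is not the implementation used in our experiments and it does not include the optimization filter. It relies only on documented facts about the PE format: a DOS header starts with the bytes "MZ", and the 4-byte little-endian field e_lfanew at offset 0x3C holds the offset of the "PE\0\0" signature. The function names and the _make_stub helper are hypothetical and exist only for this illustration.

```python
import struct
from typing import List

DOS_MAGIC = b"MZ"               # start of an MS-DOS header
PE_SIGNATURE = b"PE\x00\x00"    # start of the PE (NT) headers
E_LFANEW_OFFSET = 0x3C          # field holding the offset of the PE signature
DOS_HEADER_SIZE = 0x40          # size of the DOS header structure


def find_pe_headers(data: bytes) -> List[int]:
    """Return offsets of every 'MZ' whose e_lfanew points at a PE signature."""
    offsets = []
    pos = data.find(DOS_MAGIC)
    while pos != -1:
        # A full DOS header must fit into the remaining data.
        if pos + DOS_HEADER_SIZE <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + E_LFANEW_OFFSET)
            sig_at = pos + e_lfanew
            if data[sig_at:sig_at + 4] == PE_SIGNATURE:
                offsets.append(pos)
        pos = data.find(DOS_MAGIC, pos + 1)
    return offsets


def has_multiple_pe_headers(data: bytes) -> bool:
    """True if more than one plausible PE header is present in the data."""
    return len(find_pe_headers(data)) > 1


def _make_stub(e_lfanew: int = 0x80) -> bytes:
    """Hypothetical helper: build a synthetic, non-executable header stub."""
    dos = bytearray(e_lfanew)
    dos[0:2] = DOS_MAGIC
    struct.pack_into("<I", dos, E_LFANEW_OFFSET, e_lfanew)
    return bytes(dos) + PE_SIGNATURE + bytes(20)


# A clean stub versus a stub with a second header attached after some padding.
clean = _make_stub()
attached = _make_stub() + b"\x90" * 16 + _make_stub()
```

In this sketch, a bare "MZ" byte pair without a matching PE signature is ignored, which reduces accidental matches inside ordinary data. A file flagged by has_multiple_pe_headers would still have to pass further filtering before being labelled.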

Related Work
In

3) Hybrid Analysis:
Hybrid analysis is a combination of static and dynamic analysis. Static analysis is applied first, and then the software is run in a controlled environment [11].

1) Prepending Viruses:
In the process of infection, the virus puts its own code in front of the original file code. When such a file is executed, the OS runs the malicious code first, but the original code is also executed to hide the malicious behavior, so the user cannot notice it.

3) Inserting Viruses:

JOURNAL OF ADVANCED ENGINEERING AND COMPUT
The malicious code is placed at the address to

is also overwritten and the original program will not be able to run without a header reconstruction (instead, the malware will run). If a random section of the original file is overwritten, it is possible that the program will work, but it will also probably crash.

Table: Defined in the IMAGE_SECTION_HEADER structure. Each section has its own section header, and these headers are used to describe each of the following sections. The headers con-

3) Malware Protection:
It is common practice to protect malware against detection and analysis. When malware is packed, static analysis is more difficult because the sample must be unpacked first. A packer takes the original malware, wraps it, and creates a new binary file. Either the whole binary or only part of the file can be packed. The PE header must be reconstructed before PE header analysis [15].
We connected our application to the Cuckoo sandbox [21] to detect obfuscation and to gather data from such packed applications.

Results
It is clear that the method introduced in this paper gives satisfactory results in terms of false positive rate (FPR), unlike some other methods. However, this is mainly because our method is designed for parasitic viruses only, and some types of files must be excluded to achieve a lower FPR.
The other described methods are used for a wide range of malware.
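The effect of excluding file types on the false positive rate can be shown with a short Python sketch. It uses the standard definition FPR = FP / (FP + TN), computed over benign samples only. The sample counts and file-type labels below are invented for illustration and are not results from our experiments. The function names are hypothetical.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN), computed over benign samples only."""
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("no benign samples to evaluate")
    return false_positives / benign_total


def fpr_for(samples, excluded_types=()):
    """FPR over benign (file_type, flagged) pairs, skipping excluded types."""
    kept = [flagged for file_type, flagged in samples
            if file_type not in excluded_types]
    fp = sum(kept)            # benign files wrongly flagged as malware
    tn = len(kept) - fp       # benign files correctly left unflagged
    return false_positive_rate(fp, tn)


# Made-up benign test set: 10 installers (4 flagged) and 90 other files (1 flagged).
benign = ([("installer", True)] * 4 + [("installer", False)] * 6
          + [("other", True)] * 1 + [("other", False)] * 89)

fpr_all = fpr_for(benign)                                      # 5 / 100
fpr_filtered = fpr_for(benign, excluded_types={"installer"})   # 1 / 90
```

With these made-up counts, excluding installers lowers the FPR because most of the wrongly flagged benign files belong to the excluded type.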

Work
The experiments show that the examined method can detect a specific type of malware called parasitic viruses. Installers and uninstallers appear to be a problem: these types of files are widely used by both user-friendly applications and malware. It is better to exclude these file types from the described detection method.
The multiple PE header detection method can be used as a basis for finding malware that is in some way attached to a host file, but because a small number of legitimate applications also contain multiple headers, we cannot recommend using it as a standalone or main detection method.
However, we can recommend it as a fast method for labelling suspicious files for deeper analysis.
There is room to improve our detection method. We will use the introduced method together with other methods that do not have a negative impact on scanning time. In our subsequent research, we will try to find out whether there are Windows API functions that are often used by malware. Scanning for these functions in samples with multiple PE headers should improve the results of our method. In the future, we also plan to improve malware detection by adding AI.
Malware writers implement mechanisms to avoid multiple infections of the same file by parasitic viruses. Our subsequent research will take a deeper look at these mechanisms in order to use them for malware detection together with the method presented here.