Cloud computing is becoming a very popular computing paradigm because of advantages such as lower cost and high-performance computing power. This growing popularity has raised security concerns and drawn the interest of the law-enforcement community in forensic investigations in the cloud environment. However, today’s cloud computing architectures often do not offer the necessary support for computer forensic investigations. Collecting logs, a very important activity in digital forensics, is not trivial in a cloud computing environment. Another problem is how to provide cloud logs to investigators while preserving users’ privacy and the integrity of the logs.
Collecting logs securely from a cloud environment is challenging because users and investigators have less control over the cloud environment, whereas in traditional forensics they have reasonable control over the infrastructure. Unlike in traditional forensics, logs in a cloud environment are not located on a single log server but are distributed among several servers. In addition, multiple virtual machines can share the same physical infrastructure in cloud computing, so logs for multiple customers may be stored in a single place. A cloud infrastructure thus differs in essence from the traditional structure of a computer with a single owner.
The first step of the traditional forensic procedure is to gain physical control of the evidence, such as a computer or a hard disk. However, in cloud forensics this step is almost impossible because of the laws, procedures, and proprietary technology involved in the cloud environment. In addition, in traditional computer forensics it is relatively easy to prove the integrity of network and process logs in front of a jury, compared with cloud computing, which has a much more complex structure.
Currently, investigators depend on the cloud service provider (CSP) to collect logs from the cloud, and they must trust the CSP because they cannot verify whether the provided logs are truthful. For example, in its Zeus botnet controller security bulletin, Amazon claims that it was able to locate a Zeus botnet controller and take it down. Amazon also claims that it takes any abuse of its services seriously while preserving other cloud users’ privacy. We cannot verify these claims because Amazon never explains in detail what methods it uses to meet them. Therefore, we need to investigate a better way of managing logs in a cloud environment.
The remainder of this paper is organized as follows. Section 2 describes related work and the state of the art on this topic. Section 3 discusses the current best technical solution to the cloud forensic logging problem. Section 4 presents the conclusion, followed by the references.
2. Related Work
Birk et al. (2011) investigated the technical issues of forensic investigations in the cloud environment. However, they did not implement a concrete solution; they only offered suggestions for one.
Marty (2011) provided a guideline for logging standards in a cloud environment. The guideline covers what to log: every log entry should record the event, when it happened, who caused it, and why it happened. Marty further stated that, at a minimum, the following fields need to be collected in every log record: Timestamp, Application, User, Session ID, Severity, Reason, and Categorization. These fields help answer the questions of when, what, who, and why.
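As a minimal sketch, the minimum record described above could be represented as follows. The class name `LogRecord`, the field spellings, the JSON serialization, and the sample values are our assumptions for illustration; only the seven field names come from the guideline.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class LogRecord:
    """One log record carrying the seven minimum fields (hypothetical layout)."""
    timestamp: str       # when: ISO 8601 time in UTC
    application: str     # what: component that emitted the event
    user: str            # who: account that caused the event
    session_id: str      # who: session the event belongs to
    severity: str        # how serious the event is
    reason: str          # why: cause of the event
    categorization: str  # what: class of event


def to_json_line(record: LogRecord) -> str:
    """Serialize a record as a single JSON object with a stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# Sample values are invented for the example.
example = LogRecord(
    timestamp=datetime(2016, 3, 1, 12, 0, 0, tzinfo=timezone.utc).isoformat(),
    application="web-frontend",
    user="alice",
    session_id="s-1234",
    severity="warning",
    reason="three failed password attempts",
    categorization="authentication/failure",
)
line = to_json_line(example)
```

Making the record immutable (`frozen=True`) reflects the fact that a log entry should not change after it is written.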
Along that line, Zafarullah et al. (2011) showed that collecting the necessary logs from the cloud infrastructure is possible. They implemented a solution that collects the necessary logs from Eucalyptus, an open-source cloud infrastructure management software. Birk et al. (2011) and Dykstra et al. (2012), in contrast, proposed using public APIs or management consoles to mitigate the challenges of log acquisition. Dykstra et al. further noted that the management console and the APIs require an additional level of trust.
However, none of them proposed a scheme for storing the logs in a cloud and making them publicly available without compromising security. In the next section, we present the current best solution for reducing these challenges. Combining the previous solutions could help make the cloud environment more forensics-friendly.
3. Technical Discussion
A. Current Best Method
Figure 1. Overview of the Proposed System
As developed by Zawoad et al. (2016), the current best solution for cloud forensics relies on the basic idea of public key cryptography. The cloud provider and the law enforcement agency (LEA) each have their own public key and secret key. Users’ privacy is protected by encrypting the logs with the LEA’s public key, so that only the LEA can access users’ logs with its private key. Moreover, the logs are stored publicly in a secure way using the cloud provider’s secret key, so that we know that the cloud provider generated them. APIs are then provided to forensic investigators so that they can access the logs using the cloud provider’s public key. Finally, an auditor from the court authority can verify the integrity of the logs using the Proof of Past Log (PPL) and the Log Chain (LC).
The threat model assumes that users, investigators, and cloud providers can collude with each other to provide fake logs to the auditor. The attacks considered are log modification, privacy violation, repudiation by the user, and repudiation by the cloud provider. The following process flow details how the logs are collected securely while preserving users’ privacy. It uses network logs generated by Snort as an example; the steps from the second one onward are the same for other kinds of logs.
Figure 2. Process Flow of Retrieving Log and Storing the PPL
- The parser module first communicates with the log sources to collect different types of logs. For example, the parser can listen to Snort, a free lightweight network intrusion detection system, to store network logs.
- After acquiring logs from different sources, the parser then parses the collected logs and generates a Log Entry LE. For example, a Log Entry LE for a network log could be defined as follows:
$$LE = \langle \mathit{FromIP}, \mathit{ToIP}, T_L, \mathit{Port}, \mathit{UserId} \rangle,$$
where $T_L$ is the time of the network operation in UTC.
- The parser module then sends the Log Entry LE to the logger module to further process the LE.
- To ensure the privacy of users, the logger module encrypts some confidential information of the LE using the public key of the LEA, $PK_A$, and generates the Encrypted Log Entry ELE as follows:
$$ELE = \langle E_{PK_A}(\mathit{ToIP}, \mathit{Port}, \mathit{UserId}), \mathit{FromIP}, T_L \rangle$$
As searching on encrypted data is costly, fields of the LE that often appear in searches can be left unencrypted. For network logs, the crucial information that we encrypt includes the destination IP (ToIP), the port (Port), and the user information (UserId).
- After generating ELE, the logger module then creates the Log Chain LC to preserve the correct order of the log entries. The LC is generated as follows:
$$LC = \langle H(ELE, LC_{Prev}) \rangle,$$
where $LC_{Prev}$ is the Log Chain of the log entry that appears before the current log entry.
- The logger module now prepares an entry for the persistent log database, which we denote as DBLE. The DBLE consists of the ELE and the LC: $DBLE = \langle ELE, LC \rangle$.
- After creating the DBLE, the logger module communicates with the proof storage to retrieve the latest accumulator entry.
- In this step, the logger generates the proof of DBLE, i.e., the logger creates the membership information of the DBLE for the accumulator. The logger then updates the last retrieved accumulator entry with this membership information.
- The logger module sends the updated accumulator entry to the accumulator storage to store the proof.
- At the end of each day, the logger retrieves the last accumulator entry of each static IP, denoted as $AE_D$.
- Using the $AE_D$, the logger now generates the Proof of Past Log PPL as follows:
$$PPL = \langle AE_D, T_P, Sig_{SK_C}(AE_D, T_P) \rangle,$$
where $T_P$ represents the proof generation time, and $Sig_{SK_C}(AE_D, T_P)$ is the signature over $(AE_D, T_P)$ using the secret key of the CSP, $SK_C$.
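The Log Chain and DBLE steps of the process flow above can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation of Zawoad et al.: we take SHA-256 as the hash function H, use an all-zero value as the previous Log Chain of the very first entry, add a length prefix so the two hash inputs cannot be confused, and treat each ELE as an opaque, already encrypted byte string. The function names are hypothetical.

```python
import hashlib

# ASSUMPTION: the previous Log Chain of the first entry is 32 zero bytes.
GENESIS = b"\x00" * 32


def log_chain(ele: bytes, lc_prev: bytes) -> bytes:
    """Compute LC = H(ELE, LC_Prev) with SHA-256 standing in for H."""
    h = hashlib.sha256()
    h.update(len(ele).to_bytes(8, "big"))  # length prefix marks the field boundary
    h.update(ele)
    h.update(lc_prev)
    return h.digest()


def build_dbles(eles):
    """Turn ELEs, given in arrival order, into DBLE = (ELE, LC) pairs."""
    dbles, lc_prev = [], GENESIS
    for ele in eles:
        lc = log_chain(ele, lc_prev)
        dbles.append((ele, lc))
        lc_prev = lc
    return dbles


def verify_order(dbles) -> bool:
    """Recompute every LC; an edited, reordered, or missing non-final entry fails."""
    lc_prev = GENESIS
    for ele, lc in dbles:
        if log_chain(ele, lc_prev) != lc:
            return False
        lc_prev = lc
    return True


entries = build_dbles([b"ele-1", b"ele-2", b"ele-3"])
intact = verify_order(entries)
reordered = verify_order([entries[1], entries[0], entries[2]])
edited = verify_order([(b"ele-X", entries[0][1])] + entries[1:])
```

Because each LC depends on the one before it, changing or moving one entry invalidates the check from that point on. Dropping entries from the tail is not caught by this check alone.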
After computing the PPL, the logger module publishes the PPL on the web. The PPL can be made available through an RSS feed to protect it from manipulation by the CSP after publishing. We can also build a trust model by engaging other CSPs in the proof publication process. Whenever one CSP publishes a PPL, that PPL is also shared among the other CSPs. Therefore, we can get a valid proof as long as one CSP is honest.
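To make the accumulator and PPL steps concrete, the sketch below generates a PPL and lets an auditor check it with only the CSP's public key. Several parts are our assumptions and are not specified in the text above: a Bloom filter serves as the accumulator, the signature is textbook RSA over a SHA-256 digest with a toy key built from two small Mersenne primes (insecure and unpadded, for illustration only), and all function names and parameters are hypothetical.

```python
import hashlib

# ASSUMPTION: toy RSA key pair for the CSP. The primes 2**61 - 1 and 2**89 - 1
# are far too small and too well known to be secure; no padding is applied.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # secret exponent, plays SK_C (Python 3.8+)

M_BITS, K_HASHES = 1024, 3          # assumed Bloom filter size and hash count


def _positions(dble: bytes):
    """Derive K_HASHES bit positions for one DBLE."""
    for i in range(K_HASHES):
        digest = hashlib.sha256(bytes([i]) + dble).digest()
        yield int.from_bytes(digest, "big") % M_BITS


def add_to_accumulator(acc: int, dble: bytes) -> int:
    """Return the accumulator entry updated with the membership of dble."""
    for pos in _positions(dble):
        acc |= 1 << pos
    return acc


def is_member(acc: int, dble: bytes) -> bool:
    """False means never added; True can, rarely, be a false positive."""
    return all((acc >> pos) & 1 for pos in _positions(dble))


def _digest_int(ae_d: int, t_p: int) -> int:
    """Hash (AE_D, T_P) to an integer below N."""
    data = ae_d.to_bytes(M_BITS // 8, "big") + t_p.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def make_ppl(ae_d: int, t_p: int):
    """Build PPL = (AE_D, T_P, signature over (AE_D, T_P)) with the secret key."""
    return ae_d, t_p, pow(_digest_int(ae_d, t_p), D, N)


def verify_ppl(ppl) -> bool:
    """Auditor-side check that needs only the public values N and E."""
    ae_d, t_p, sig = ppl
    return pow(sig, E, N) == _digest_int(ae_d, t_p)


acc = 0
for dble in (b"dble-1", b"dble-2"):
    acc = add_to_accumulator(acc, dble)
ppl = make_ppl(acc, 1456833600)  # arbitrary example time in Unix seconds
```

An auditor who later receives a log entry would first call `verify_ppl` on the published proof and then `is_member` on the entry. A real deployment would replace the toy key with a standard signature scheme and size the accumulator for the expected daily log volume.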
B. Strengths and Weaknesses Discussion
This method’s contribution is that it incorporates the best ideas from previous work and develops a concrete solution. However, it opens the possibility of a major privacy breach, since the law enforcement agency is able to see all users’ logs. Although the agency needs access when a user is involved in an incident, it should not have access when the user is honest. A suggested solution such as Kerberos provides only an authentication protocol, which helps ensure that the law enforcement agency, and no one else, accesses the logs. Kerberos still cannot prevent the law enforcement agency from accessing honest users’ logs. Although we can design rules that ensure logs are generated only for suspicious activities, such rules will always have false positives and false negatives.
4. Conclusion
Our study found that there is currently no perfect solution for a cloud forensic logging mechanism. The current best solution that we found carries the possibility of a major privacy breach, because it requires trusting the law enforcement agency with access to users’ logs.
References
- Zawoad, A. K. Dutta and R. Hasan, “Towards Building Forensics Enabled Cloud Through Secure Logging-as-a-Service,” in IEEE Transactions on Dependable and Secure Computing, vol. 13, no. 2, pp. 148-162, March-April 1 2016.
- Patrascu, A., & Patriciu, V. (2015). Logging for Cloud Computing Forensic Systems. International Journal of Computers Communications & Control, 10(2), 222-229.
- Zeus botnet controller. http://aws.amazon.com/security/security-bulletins/zeus-botnet-controller/.
- Dykstra and A. Sherman. Acquiring forensic evidence from infrastructure-as-a-service cloud computing: Exploring and evaluating tools, trust, and techniques. DoD Cyber Crime Conference, January 2012.
- Zafarullah, F. Anwar, and Z. Anwar. Digital forensics for eucalyptus. In Frontiers of Information Technology (FIT), pages 110–116. IEEE, 2011.
- Marty. Cloud application logging for forensics. In Proceedings of the 2011 ACM Symposium on Applied Computing, pages 178–184. ACM, 2011.
- Nurmi et al., “The Eucalyptus Open-Source Cloud-Computing System,” Cluster Computing and the Grid, 2009. CCGRID ’09. 9th IEEE/ACM International Symposium on, Shanghai, 2009, pp. 124-131.
- Eucalyptus, http://www8.hp.com/us/en/cloud/helion-eucalyptus-overview.html
- Birk and C. Wegener, “Technical issues of forensic investigations in cloud computing environments,” in SADFE. IEEE, 2011, pp. 1–10.
- Snort, www.snort.org
- Kerberos, http://web.mit.edu/kerberos/