Security Property of Formal Method
Information leakage occurs when a malicious application collects sensitive personal information and sends it to an adversary without the user's knowledge and consent. It has become one of the most significant security issues in recent years.

Information flow, or data flow, is a long-standing topic that has been studied extensively. Taint analysis is a method for tracking information flow in applications. It labels the variables of interest as tainted and monitors every access to them, propagating the taint to any new variable that is derived from a tainted one.

However, taint analysis faces a trade-off between precision and accuracy. On one hand, if the propagation rule is too coarse-grained, for example tainting every variable that reads a tainted variable, the analysis taints too many variables. Not all of them are actually related to the original tainted variables in the information flow, so precision drops. On the other hand, if the propagation rule is too strict, the analysis may miss variables that are actually related to the tainted variables, so accuracy drops.

Like other program analyses, taint analysis can be carried out dynamically or statically, and each approach has advantages and disadvantages. Dynamic analysis is normally more precise than static analysis because it has access to the runtime information of the program. However, it suffers from low code coverage: it can hardly cover all branches and code in the program, so it may miss some information flows.
In extreme cases, a malicious application can detect that it is being analyzed and change its behaviour, which renders dynamic analysis ineffective.

In contrast, static analysis has high code coverage, which potentially enables it to identify all possible information flows. However, it is hard to scale to large applications because there may be a very large number of information flows, and the lack of runtime information weakens its precision.
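The trade-off between coarse and strict taint propagation rules described in this section can be illustrated with a minimal sketch. The toy program, the statement kinds, and the variable names (`imei`, `salt`, and so on) are hypothetical and chosen only for illustration. Real taint analyses work on bytecode or an intermediate representation and handle control flow, which this straight-line example does not.

```python
# Hypothetical toy program: each statement is (target, operands, kind).
#   "source": target receives a sensitive value
#   "copy"  : target = operand
#   "binop" : target = operand1 + operand2
#   "len"   : target = len(operand), which reveals only the length
Statement = tuple[str, tuple[str, ...], str]

PROGRAM: list[Statement] = [
    ("imei", (), "source"),          # sensitive value enters here
    ("a", ("imei",), "copy"),        # a = imei
    ("b", ("a", "salt"), "binop"),   # b = a + salt
    ("n", ("imei",), "len"),         # n = len(imei)
    ("c", ("salt",), "copy"),        # c = salt (unrelated to imei)
]


def propagate(program: list[Statement], propagating_kinds: set[str]) -> set[str]:
    """Single forward pass over straight-line code.

    A target becomes tainted when its statement kind is allowed to
    propagate taint and at least one operand is already tainted.
    """
    tainted: set[str] = set()
    for target, operands, kind in program:
        if kind == "source":
            tainted.add(target)
        elif kind in propagating_kinds and any(op in tainted for op in operands):
            tainted.add(target)
    return tainted


# Coarse rule: every statement that reads a tainted variable propagates taint.
coarse = propagate(PROGRAM, {"copy", "binop", "len"})

# Strict rule: only direct copies propagate taint.
strict = propagate(PROGRAM, {"copy"})

print(sorted(coarse))
print(sorted(strict))
```

Under the coarse rule `n` is tainted even though it holds only the length of the identifier, which is the kind of over-tainting that lowers precision. Under the strict rule `b` is missed even though it is computed from the identifier, which is the kind of omission that lowers accuracy.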
Difficulties of Analysis for Android
One of the major difficulties in performing an accurate and precise information flow analysis on Android is the lack of a precise Android model. This is also a limitation of two earlier papers, whose incomplete Android models keep them from improving precision. In addition, Android has its own model of user interaction and its own inter-component communication mechanism among apps. Building a full-fledged Android model is difficult because of the size of the Android OS, its rich APIs, and its closed-source and native code, which involves programming languages other than Java.

Another difficulty comes from the design paradigm of Android. All Android applications are event-driven: whenever an event arrives, the corresponding callback in the application is invoked. This complicates the analysis because multiple flows may run through a single callback, and the possible event orders are hard to enumerate.

B. Expected Outcome
One of the current state-of-the-art systems for this problem is Droidsafe. It claims to have developed the most complete Android model so far to support information flow analysis. As a result, it achieves more precise and accurate results than the previous state-of-the-art framework, FlowDroid, on three benchmark suites: DroidBench and two others developed by the Droidsafe team.

However, a major drawback of this work is that the information flow analysis is flow-insensitive. Although this covers all possible event orders and flows, it reduces precision because not all event orders are possible on Android. One possible improvement is therefore to take the possible event orders into account and eliminate flows that cannot occur.
This takes two steps: the first step still uses Droidsafe to detect all information flows between sources and sinks; the second step uses the possible event orders to eliminate the impossible flows, so that only valid flows remain.

Another drawback is that Droidsafe still cannot handle implicit flows. Another possible improvement is therefore to apply state-of-the-art implicit flow detection on top of the current work.
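The two-step idea of first collecting all reported flows and then discarding those that no event order can produce can be sketched as follows. This is only an illustration under stated assumptions: the list of reported flows is invented and does not reflect Droidsafe's actual output format, and event orders are approximated by the callback transitions of a single Activity instance, ignoring other components, UI callbacks, and state that survives across instances.

```python
from collections import deque

# Callback transitions within one Activity instance (simplified lifecycle).
LIFECYCLE: dict[str, list[str]] = {
    "onCreate": ["onStart"],
    "onStart": ["onResume"],
    "onResume": ["onPause"],
    "onPause": ["onResume", "onStop"],
    "onStop": ["onRestart", "onDestroy"],
    "onRestart": ["onStart"],
    "onDestroy": [],
}


def can_run_before(first: str, later: str) -> bool:
    """Return True if `later` can execute at or after `first`.

    Breadth-first search over the lifecycle graph; a callback is
    considered reachable from itself.
    """
    seen = {first}
    queue = deque([first])
    while queue:
        current = queue.popleft()
        if current == later:
            return True
        for successor in LIFECYCLE.get(current, []):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False


# Step 1 (hypothetical data): flows reported by a flow-insensitive analysis,
# given as (callback containing the source, callback containing the sink, label).
REPORTED_FLOWS = [
    ("onCreate", "onResume", "device id -> network"),
    ("onDestroy", "onCreate", "location -> SMS"),
    ("onStop", "onStart", "contacts -> log"),
]


def filter_feasible(flows):
    """Step 2: keep only flows whose sink callback can follow the source callback."""
    return [flow for flow in flows if can_run_before(flow[0], flow[1])]


feasible = filter_feasible(REPORTED_FLOWS)
for source_cb, sink_cb, label in feasible:
    print(f"{label}: {source_cb} -> {sink_cb}")
```

In this sketch the flow from `onDestroy` to `onCreate` is dropped because no callback of the same instance runs after `onDestroy`, while the flow from `onStop` to `onStart` is kept because the activity can be restarted through `onRestart`.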
Evaluation
Because the improvement is built on top of the original Droidsafe, one natural evaluation method is to reuse the benchmarks that were used to evaluate Droidsafe. In addition, because the improvement concerns the information flows under possible event orders, it is also necessary to develop a new benchmark suite that targets flows dependent on event order.
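One way such a benchmark could be scored is by comparing the flows an analysis reports against the flows the benchmark is known to contain. The sketch below computes precision and recall for this comparison. It assumes that what this text calls accuracy corresponds to recall, and the flow identifiers and counts are invented for illustration, not measured results.

```python
def score(expected: set[str], reported: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of reported flows against ground truth.

    precision = true positives / all reported flows
    recall    = true positives / all expected flows
    An empty denominator is scored as 1.0 by convention in this sketch.
    """
    true_positives = len(expected & reported)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(expected) if expected else 1.0
    return precision, recall


# Hypothetical ground truth and analysis outputs for one benchmark app.
expected_flows = {"f1", "f2", "f3"}
before_filtering = {"f1", "f2", "f3", "f4", "f5"}  # f4, f5 are spurious
after_filtering = {"f1", "f2", "f3", "f4"}          # f5 removed as infeasible

print(score(expected_flows, before_filtering))
print(score(expected_flows, after_filtering))
```

If event-order filtering works as intended, precision should rise while recall stays unchanged, since only infeasible flows are removed.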