<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:8.5in 11.0in;
margin:70.85pt 70.85pt 56.7pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=DE link="#0563C1" vlink="#954F72"><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>Hi Sumaya,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>I’m not sure that your approach is likely to scale to realistic apps. For every source, FlowDroid needs to track a taint abstraction through the program. With thousands of sources, I don’t think the analysis will terminate in any realistic time frame. You normally have a few dozen or maybe a few hundred sources that apply to a single application, but not thousands.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>I’d suggest that you only specify those methods as sources that you are actually interested in. It might very well be the case that data from some method foo() is passed to native code, but if the return value of foo() is not of interest, it doesn’t really matter what the native code does with it.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>Secondly, concerning the callgraph: SPARK’s callgraph is incomplete, because it needs to propagate type information from allocation sites to call sites. 
Therefore, if there is no allocation site (e.g., because the allocation is hidden inside a factory method in the OS), the calls on the respective base object are missing from the CG. FlowDroid handles these cases through StubDroid summaries, and does not rely on the SPARK CG alone.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p>&nbsp;</o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>Best regards,<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'> Steven<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p>&nbsp;</o:p></span></p><p class=MsoNormal><b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Soot-list &lt;soot-list-bounces@cs.mcgill.ca&gt; <b>On Behalf Of </b>Sumaya Abdullah A Almanee<br><b>Sent:</b> Thursday, April 11, 2019 12:35 AM<br><b>To:</b> soot-list@cs.mcgill.ca<br><b>Subject:</b> [Soot-list] [Android][FlowDroid][SPARK] A question about the precision of taint analysis in Flowdroid (and possible false negatives in spark)<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p>&nbsp;</o:p></span></p><div><div><p class=MsoNormal><span lang=EN-US><o:p>&nbsp;</o:p></span></p></div><p class=MsoNormal>I'm currently using FlowDroid to track taint propagation between certain sources and sinks. 
Since I'm performing a separate analysis on some native libraries of Android APKs, I've decided to leverage FlowDroid to track any taints passed/leaked from the <b>Dalvik</b> side to the <b>native</b> side.<o:p></o:p></p><div><p class=MsoNormal>I configured the Source_Sink files by first examining the reachable functions in the call graph generated by FlowDroid (using SPARK) and then marking these reachable functions as follows: any native function is marked as _SINK_ and everything else as _SOURCE_.<o:p></o:p></p><div><div><p class=MsoNormal><o:p>&nbsp;</o:p></p></div><div><div><p class=MsoNormal>I obtained some initial results. A small snippet of these results is shown below (the results highlighted in yellow are the ones that I'm mainly interested in):<o:p></o:p></p></div><div><p class=MsoNormal><o:p>&nbsp;</o:p></p></div><div><div><div><div><p class=MsoNormal><img width=602 height=185 style='width:6.2708in;height:1.927in' id="_x0000_i1025" src="cid:image002.png@01D4F39C.532776D0" alt="Screen Shot 2019-04-10 at 3.04.27 PM.png"><o:p></o:p></p></div></div></div><div><p class=MsoNormal><o:p>&nbsp;</o:p></p></div><div><p class=MsoNormal>Based on the way I constructed the sources and sinks config file, I was expecting more leaks to be reported. If I understand correctly, these results might contain <b>false positives</b>, for example in the case of arrays or collections (due to over-approximation). However, FlowDroid is unlikely to miss any leaks (a low <b>false negative</b> rate). Is this correct? What I'm trying to figure out here is:<o:p></o:p></p></div><div><p class=MsoNormal>1) An estimate of false positives or false negatives in FlowDroid's reported leaks. 
<o:p></o:p></p></div></div><div><p class=MsoNormal>2) Possible reasons why some leaks might be missing (false negatives).<o:p></o:p></p></div><div><p class=MsoNormal>3) Since FlowDroid relies on the call graph (in this case SPARK's) for reporting taints, and since the absence of a node in the graph might also result in missing leaks, I was wondering whether there is also an estimate of false negatives in SPARK.<o:p></o:p></p></div></div></div></div><div><p class=MsoNormal><o:p>&nbsp;</o:p></p></div><div><p class=MsoNormal>I really appreciate your time and help with this!<o:p></o:p></p></div><div><p class=MsoNormal><o:p>&nbsp;</o:p></p></div><div><p class=MsoNormal>Best,<o:p></o:p></p></div><div><p class=MsoNormal>Sumaya<o:p></o:p></p></div></div></div></body></html>