ROSE 编译器框架/程序分析

概述

ROSE 已实现以下编译器分析

调用图分析
控制流图
数据流分析：包括活性分析、定义-使用分析等。
依赖分析
副作用分析

控制流图

ROSE 提供了控制流图的几种变体

虚拟控制流图

虚拟控制流图 (vcfg) 在需要时动态生成。因此，ROSE AST 与其对应的控制流图之间不会出现不匹配。缺点是，每次需要时都会重新生成相同的 vcfg。这可能成为性能瓶颈。

事实

文档：虚拟 CFG 在 ROSE 教程的**第 19 章虚拟 CFG**中有所介绍 pdf
源文件
- src/frontend/SageIII/virtualCFG/virtualCFG.h
- src/frontend/SageIII/virtualCFG/virtualCFG.C // 不仅提供 virtualCFG.h 的定义，还在 VirtualCFG 中扩展 AST 节点支持
- src/ROSETTA/Grammar/Statement.code // SgStatement 节点等成员函数的原型
- src/ROSETTA/Grammar/Expression.code // SgExpression 节点等成员函数的原型
- src/ROSETTA/Grammar/Support.code // SgInitialized(LocatedNode) 节点等成员函数的原型
- src/ROSETTA/Grammar/Common.code // 其他节点等成员函数的原型
- src/frontend/SageIII/virtualCFG/memberFunctions.C // 每个 AST 节点与虚拟 CFG 相关的成员函数的实现
  - 此文件将有助于生成 buildTree/src/frontend/SageIII/Cxx_Grammar.h
测试目录：tests/CompileTests/virtualCFG_tests
一个点图生成器：为原始或有趣的虚拟 CFG 生成点图。
- 源代码：tests/CompileTests/virtualCFG_tests/generateVirtualCFG.C
- 安装在 rose_ins/bin 下

如何扩展 VirtualCFG 以支持 OpenMP

如何在中添加 SgOmpClause 的 CFGNode
1. 识别 ROSETTA 中前端的类名

例如，如果 SgOmpPrivateClause 或 SgOmpSharedClause 在 VirtualCFG 中不受支持，则有必要检查 buildTree/src/frontend/SageIII/Cxx_Grammar.h 是否具有为添加 SgOmpClause 的 CFGEdge 的函数原型，例如 SgOmpClause::cfgInEdge() SgOmpClause::cfgOutEdge() 如果没有原型，则意味着您的 CFGNode 不属于 SgExpression、SgStatement 和 SgExpression。SgOmpClause 可以添加到 src/ROSETTA/Grammar/Support.code 中，

2. 在 src/frontend/SageIII/virtualCFG/memberFunctions.C 中添加函数定义，以给出添加 CFGNode 和 CFGEdge 的定义

  step1: construct SgOmpClause::cfgndexForEnd()
            this index is based on the AST graph of your source code, the index is explicit in AST node
   real example:

SgOmpClauseBodyStatement::cfgIndexForEnd() const {

  int size = this->get_clauses().size(); // the number of clauses in #pragma omp parallel
  return (size + 1); // clauses + body

}

  step2: construct cfgInEdge() for this CFGNode
           please refer to AST, since AST can show all node information,
           real example:
           std::vector<CFGEdge> SgOmpClauseBodyStatement::cfgInEdges(unsigned int idx) {
 std::vector<CFGEdge> result;
 addIncomingFortranGotos(this, idx, result);

if( idx == 0 )

  {
   makeEdge( getNodeJustBeforeInContainer( this ), CFGNode( this, idx ), result );
   }
  else
  {
     if( idx == ( this->get_clauses().size() + 1 ) )
      {
       makeEdge( this->get_body()->cfgForEnd(), CFGNode( this, idx ) , result ); //connect variables clauses first, then parallel body
      }
     else
      {
          if( idx < ( this->get_clauses().size() + 1 ) )
           {
              makeEdge( this->get_clauses()[idx -1]->cfgForEnd(), CFGNode( this, idx ), result );//connect variables clauses first, then parallel body
           }
          else
           {
            ROSE_ASSERT( !" Bad index for SgOmpClauseBodyStatement" );
           }
      }
  }

return result; }

 step3: construct cfgOutEdge for CFGNode
 For example:
   std::vector<CFGEdge> SgOmpClauseBodyStatement::cfgOutEdges(unsigned int idx) {//! edited by Hongyi for edges between SgOmpClauseBodyStatement  and SgOmpClause
  std::vector<CFGEdge> result;

addIncomingFortranGotos( this, idx, result ); if( idx == (this->get_clauses().size() + 1 ) )

 {
    makeEdge( CFGNode( this ,idx), getNodeJustAfterInContainer( this ), result );
 }
else
 {
   if( idx == this->get_clauses().size()  )
      {
        makeEdge( CFGNode( this, idx ), this->get_body()->cfgForBeginning(), result ); // connect variable clauses first, parallel body last
      }
     else
     {
       if( idx < this->get_clauses().size() ) // connect variables clauses first, parallel body last
         {
           makeEdge( CFGNode( this, idx ), this->get_clauses()[idx]->cfgForBeginning(), result );
         }
        else
         {
           ROSE_ASSERT( !"Bad index for SgOmpClauseBodyStatement" );
         }
      }
 }

return result; }

3. 如何检查结果

首先检查 AST 图 /Users/ma23/Desktop/Screen shot 2012-08-24 at 11.51.33 AM.png 在此示例中，您将发现从 SgOmpParallelStatement 有三个子树。一个是 get_body，另外两个分别是 SgOmpPrivateClasue 和 SgOmpSharedClauserespectively。因此索引为 3。// 访问 CFGNode 的顺序是先访问子句，然后是并行主体

静态控制流图

由于虚拟控制流图的性能问题，我们开发了另一个静态版本，它像普通图一样持久存在内存中。

事实

文档：ROSE 教程的**19.7 静态 CFG** pdf
测试目录：rose/tests/CompileTests/staticCFG_tests

静态和过程间 CFG

事实

文档：ROSE 教程的**19.8 静态、过程间 CFG** pdf
测试目录：rose/tests/CompileTests/staticCFG_tests

虚拟函数分析

事实

原始贡献者：来自 UTSA 的 Faizur，完成于 2011 年夏季
代码：位于 src/midend/programAnalysis/VirtualFunctionAnalysis。
使用以下论文中使用的技术实现："过程间指针别名分析 - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.2382"。该论文将虚拟函数解析简化为指针别名问题。该论文采用流敏感过程间数据流分析来解决别名问题，使用紧凑表示图来表示别名关系。
ROSE 存储库的 roseTests 文件夹中有一些测试文件，他告诉我实现支持函数指针以及跨不同文件（头文件等）编写的代码。
文档：ROSE 教程 pdf 的第 24 章基于数据流分析的虚拟函数分析

定义-使用分析

如果您想要定义-使用分析，请尝试以下方法 http://www.rosecompiler.org/ROSE_HTML_Reference/classVariableRenaming.html

VariableRenaming v(project);
v.run();
v.getReachingDefsAtNode(...);

测试

cd buildtree/tests/nonsmoke/functional/roseTests/programAnalysisTests/defUseAnalysisTests
输入 ```make check```

活性分析

请参阅活性分析

指针分析

https://mailman.nersc.gov/pipermail/rose-public/2010-September/000390.html

On 9/1/10 11:49 AM, Fredrik Kjolstad wrote:
> Hi all,
>
> I am trying to use Rose as the analysis backend for a refactoring 
> engine and for one of the refactorings I am implementing I need 
> whole-program pointer analysis.  Rose has an implementation of 
> steensgard's algorithm and I have some questions regarding how to use 
> this.
>
> I looked at the file steensgaardTest2.C to figure out how to invoke 
> this analysis and I am a bit perplexed:
>
> 1. The file SteensgaardPtrAnal.h that is included by the test is not 
> present in the include directory of my installed version of Rose. 
>  Does this mean that the Steensgaard implementation is not a part of 
> the shipped compiler, or does it mean that I have to retrieve an 
> instance of it through some factory method whose static return type is 
> PtrAnal?

I believe it is in the shipped compiler. And you're using the correct 
file to figure out how to use it. It should be in the installed include 
directory --- if it is not, it's probably something that needs to be 
fixed. But you can copy the include file from 
ROSE/src/midend/programAnalysis/pointerAnal/ as a temporary fix

> 2. How do I initialize the alias analysis for a given SgProject?  Is 
> this done through the overloaded ()?

The steensgaardTest2.C file shows how to set up everything to invoke the 
analysis. Right now you need to go over each function definition and 
invoke the analysis explicitly, as illustrated by the main function in 
the file.

> 3. Say I want to query whether two pointer variables alias and I have 
> SGNodes to their declarations.  How do I get the AstNodePtr needed to 
> invoke the may_alias(AstInterface&, const AstNodePtr&, const 
> AstNodePtr&) function?  Or maybe I should rather invoke the version of 
> may_alias that takes two strings (varnames)?

To convert a SgNode* x to AstNodePtr, wrap it inside  an AstNodePtrImpl 
object, i.e., do AstNodePtrImpl(x), as illustrated inside the () 
operator of TestPtrAnal in steensgaardTest2.C.

> 4. How do I query whether two parameters alias?

The PtrAnal class has  the following interface method
     may_alias(AstInterface& fa, const AstNodePtr& r1, const AstNodePtr& 
r2);
It is implemented in SteensgaardPtrAnal class, which inherit PtrAnal 
class. To build AstInterface and AstNodePtr,
you simply need to wrap SgNode* with some wrapper classes, illustrated 
by steensgaardTest2.C

- Qing Yi

void func(void) {
int* pointer;
int* aliasPointer;

pointer = malloc(sizeof(int));
aliasPointer = pointer;
*aliasPointer = 42;

printf("%d\n", *pointer);
}

The SteensgaardPtrAnal::output function returns:
c:(sizeof(int )) LOC1=>LOC2
c:42 LOC3=>LOC4
v:func LOC5=>LOC6 (inparams: ) ->(outparams:  LOC7)
v:func-0 LOC8=>LOC7
v:func-2-1 LOC9=>LOC10
v:func-2-3 LOC11=>LOC12 (pending  LOC10 LOC13=>LOC14 =>LOC4 )
v:func-2-4 LOC15=>LOC16 =>LOC17
v:func-2-5 LOC18=>LOC14 =>LOC4
v:func-2-aliasPointer LOC19=>LOC14 =>LOC4
v:func-2-pointer LOC20=>LOC13 =>LOC14 =>LOC4
v:malloc LOC21=>LOC22 (inparams:  LOC2) ->(outparams:  LOC12)
v:printf LOC23=>LOC24 (inparams:  LOC16=>LOC17  LOC14=>LOC4 ) ->(outparams:
 LOC25)

SSA

ROSE 已实现 SSA 形式。邮件列表中的一些讨论：链接。

Rice 分支有一个数组 SSA 的实现。我们正在等待他们的提交被推送到 Jenkins。--Liao (讨论 • 贡献) 2012 年 6 月 19 日 18:17 (UTC)

副作用分析

ROSE 中至少有两种实现

第一种：推荐使用。

SageInterface 接口函数内部：http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html
- 例如 bool SageInterface::collectReadWriteVariables (SgStatement *stmt, std::set< SgInitializedName * > &readVars, std::set< SgInitializedName * > &writeVars) // 可以将 for 循环及其主体（基本块语句）传递给此函数，并获取读/写变量。如果分析不成功，它将返回 false。因此，在使用结果之前务必检查返回值。
- 此函数是以下函数的包装函数
- bool AnalyzeStmtRefs(LoopTransformInterface &la, const AstNodePtr& n,CollectObject<AstNodePtr> &wRefs, CollectObject<AstNodePtr> &rRefs) // 来自 DepInfoAnal.C
- StmtSideEffectCollect(LoopTransformInterface::getSideEffectInterface())(fa,n,&colw,&colr); //src/midend/astUtil/astSupport/StmtInfoCollect.h

第二种：这种方法不够健壮，无法使用。它还依赖 sqlite 库进行安装。

源代码：src/midend/programAnalysis/sideEffectAnalysis
测试：tests/roseTests/programAnalysisTests/sideEffectAnalysisTests
该算法基于以下论文：K. D. Cooper 和 K. Kennedy。1988 年。线性时间内的过程间副作用分析。在程序设计语言设计和实现 (PLDI '88) 1988 年 ACM SIGPLAN 会议论文集中，R. L. Wexelblat（编辑）。ACM，纽约，纽约，美国，57-66。

通用数据流框架

随着 ROSE 项目的进行，我们收集了相当多的数据流分析版本。维护和使用它们很痛苦，因为它们

重复迭代不动点算法
分散在不同的目录中，并且
使用不同的表示来表示结果。

一个持续的努力是将所有数据流分析工作整合到一个框架中。

快速事实

原始作者：Greg Bronevetsky
代码审阅者：廖春华
文档
源代码：./src/midend/programAnalysis/genericDataflow 下的文件
测试：tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests
当前实现的分析
- 支配者分析：dominatorAnalysis.h dominatorAnalysis.C
- 活跃/死变量分析，或活跃性分析：liveDeadVarAnalysis.h liveDeadVarAnalysis.C
- 常量传播：constantPropagation.h constantPropagation.C：TODO 需要将文件从 /tests 移动到 src/

在通用数据流框架中查看更多内容。

依赖分析

TODO：事实证明，接口工作尚未合并到我们的主分支中。因此，以下说明不适用！

依赖图的接口可以在 DependencyGraph.h 中找到。底层表示是 n DepGraph.h。需要 BGL 来访问图。

这里附有这封电子邮件的 6 个示例。在 deptest.C 中，还有一些宏可以实现更准确的分析。

如果定义了 USE_IVS，则将执行归纳变量替换。如果定义了 USE_FUNCTION，则依赖项可以采用用户指定的函数副作用接口。否则，如果两者都没有定义，它将执行正常的依赖分析并构建图形。