评论

收藏

[PHP] Rust源码剖析:Lint

开发技术 开发技术 发布于:2022-09-01 15:00 | 阅读数:542 | 评论:0

      
  • 作者:张正@KusionStack开发组  
  • 原文:https://mp.weixin.qq.com/s/qV-0nXhkuNjzsRTFceHkwQ
> 摘要:前一篇文章 介绍了关于 Lint 和 LintPass 的一些概念和实现。基于这些结构,提供了一个简易的 Lint 检查的实现方式。本文主要介绍 CombinedLintPass 这一结构的实现,并基于 CombinedLintPass 进一步优化 Lint 的实现。
背景
在 KusionStack 技术栈中, KCL 配置策略语言是重要的组成部分之一。为了帮助用户更好的编写 KCL 代码,我们也为 KCL 语言开发了一些语言工具,Lint 就是其中一种。Lint 工具帮助用户检查代码中潜在的问题和错误,同时也可以用于自动化的代码检查,保障仓库代码规范和质量。因为 KCL 语言由 Rust 实现,一些功能也学习和参考了 Rustc。本文是在学习 Rustc 过程中的一些思考和沉淀,在这里做一些分享。
前一篇文章 介绍了关于 Lint 和 LintPass 的一些概念和实现。基于这些结构,提供了一个简易的 Lint 检查的实现方式。本文主要介绍 CombinedLintPass 这一结构的实现,并基于 CombinedLintPass 进一步优化 Lint 的实现。
CombinedLintpass
Rustc 在 LintPass 的中实现了 Lint 工具检查的具体逻辑。并且使用 Visitor 模式遍历 AST 的同时调用 lintpass 中的 check_*方法。
impl ast_visit::Visitor for Linter {
  fn visit_crate(a: ast:crate){
    for lintpass in lintpasses{
      lintpass.check_crate(a)
    }
    walk_crate();
  }
  fn visit_stmt(a: ast:stmt){
    for lintpass in lintpasses{
      lintpass.check_stmt(a)
    }
    walk_stmt();
  }
  ...
}
但是,Rustc 自身和 clippy 提供的 Lint 定义多达550+多个。考虑到性能因素,定义大量的 LintPass,分别注册和调用显然是不合适的。Rustc 提供了一种更优的解决方法:既然可以将多个 Lint 组织为一个 LintPass,同样也可以将多个 LintPass 组合成一个 CombinedLintPass。
> Compiler lint passes are combined into one pass > Within the compiler, for performance reasons, we usually do not register dozens of lint passes. Instead, we have a single lint pass of each variety (e.g., BuiltinCombinedModuleLateLintPass) which will internally call all of the individual lint passes; this is because then we get the benefits of static over dynamic dispatch for each of the (often empty) trait methods. > Ideally, we'd not have to do this, since it adds to the complexity of understanding the code. However, with the current type-erased lint store approach, it is beneficial to do so for performance reasons.
BuiltinCombinedEarlyLintPass
CombinedLintPass 同样分为 early 和 late 两类。 以 builtin 的 early lint 为例,Rustc 在 rustc_lint::src::lib.rs 中为这些 lintpass 定义了一个 BuiltinCombinedEarlyLintPass 结构。
early_lint_passes!(declare_combined_early_pass, [BuiltinCombinedEarlyLintPass]);
虽然这个定义看起来只有一行,但其中通过若干个宏的展开,汇总了14个 LintPass,并且每个 LintPass 提供了50多个 check_* 方法。接下来一一说明这些宏。
BuiltinCombinedEarlyLintPass 的宏定义
early_lint_passes
macro_rules! early_lint_passes {
  ($macro:path, $args:tt) => {
    $macro!(
      $args,
      [
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        UnusedImportBraces: UnusedImportBraces,
        UnsafeCode: UnsafeCode,
        AnonymousParameters: AnonymousParameters,
        EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
        NonCamelCaseTypes: NonCamelCaseTypes,
        DeprecatedAttr: DeprecatedAttr::new(),
        WhileTrue: WhileTrue,
        NonAsciiIdents: NonAsciiIdents,
        HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
        IncompleteFeatures: IncompleteFeatures,
        RedundantSemicolons: RedundantSemicolons,
        UnusedDocComment: UnusedDocComment,
      ]
    );
  };
}
首先是 early_lint_passes 宏,这个宏的主要作用是定义了所有的 early lintpass。这里的 lintpass 是成对出现的,:左边为 lintpass 的 Identifier,:右边为 lintpass 的constructor。所以会出现 EllipsisInclusiveRangePatterns::default() 和 DeprecatedAttr::new()这种形式。early_lint_passes 会将定义的 early lintpass 和 第二个参数一起传递给下一个宏。 通过这个宏,之前的BuiltinCombinedEarlyLintPass的定义被展开为:
declare_combined_early_pass!([BuiltinCombinedEarlyLintPass], [
        UnusedParens: UnusedParens,
        UnusedBraces: UnusedBraces,
        UnusedImportBraces: UnusedImportBraces,
        UnsafeCode: UnsafeCode,
        AnonymousParameters: AnonymousParameters,
        EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
        NonCamelCaseTypes: NonCamelCaseTypes,
        DeprecatedAttr: DeprecatedAttr::new(),
        WhileTrue: WhileTrue,
        NonAsciiIdents: NonAsciiIdents,
        HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
        IncompleteFeatures: IncompleteFeatures,
        RedundantSemicolons: RedundantSemicolons,
        UnusedDocComment: UnusedDocComment,
      ])
declare_combined_early_pass
macro_rules! declare_combined_early_pass {
  ([$name:ident], $passes:tt) => (
    early_lint_methods!(declare_combined_early_lint_pass, [pub $name, $passes]);
  )
}
declare_combined_early_pass 宏接收 early_lint_passes宏传来的 name(BuiltinCombinedEarlyLintPass) 和 passes,并继续传递给 early_lint_methods 宏。 通过这个宏,BuiltinCombinedEarlyLintPass的定义继续展开为:
early_lint_methods!(declare_combined_early_lint_pass, 
          [pub BuiltinCombinedEarlyLintPass, 
            [
              UnusedParens: UnusedParens,
              UnusedBraces: UnusedBraces,
              UnusedImportBraces: UnusedImportBraces,
              UnsafeCode: UnsafeCode,
              AnonymousParameters: AnonymousParameters,
              EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
              NonCamelCaseTypes: NonCamelCaseTypes,
              DeprecatedAttr: DeprecatedAttr::new(),
              WhileTrue: WhileTrue,
              NonAsciiIdents: NonAsciiIdents,
              HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
              IncompleteFeatures: IncompleteFeatures,
              RedundantSemicolons: RedundantSemicolons,
              UnusedDocComment: UnusedDocComment,
         ]
          ]);
early_lint_methods
macro_rules! early_lint_methods {
  ($macro:path, $args:tt) => (
    $macro!($args, [
      fn check_param(a: &ast::Param);
      fn check_ident(a: &ast::Ident);
      fn check_crate(a: &ast::Crate);
      fn check_crate_post(a: &ast::Crate);
      ...
    ]);
  )
}
early_lint_methods 宏在前一篇文章中也介绍过,它定义了 EarlyLintPass 中需要实现的 check_*函数,并且将这些函数以及接收的参数 $args传递给下一个宏。因为 BuiltinCombinedEarlyLintPass 也是 early lint 的一种,所以同样需要实现这些函数。 通过这个宏,BuiltinCombinedEarlyLintPass的定义继续展开为:
declare_combined_early_lint_pass!(
  [pub BuiltinCombinedEarlyLintPass, 
    [
      UnusedParens: UnusedParens,
      UnusedBraces: UnusedBraces,
      UnusedImportBraces: UnusedImportBraces,
      UnsafeCode: UnsafeCode,
      AnonymousParameters: AnonymousParameters,
      EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
      NonCamelCaseTypes: NonCamelCaseTypes,
      DeprecatedAttr: DeprecatedAttr::new(),
      WhileTrue: WhileTrue,
      NonAsciiIdents: NonAsciiIdents,
      HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
      IncompleteFeatures: IncompleteFeatures,
      RedundantSemicolons: RedundantSemicolons,
      UnusedDocComment: UnusedDocComment,
    ]
  ],
  [
    fn check_param(a: &ast::Param);
    fn check_ident(a: &ast::Ident);
    fn check_crate(a: &ast::Crate);
    fn check_crate_post(a: &ast::Crate);
    ...
  ]
)
declare_combined_early_lint_pass
macro_rules! declare_combined_early_lint_pass {
  ([$v:vis $name:ident, [$($passes:ident: $constructor:expr,)*]], $methods:tt) => (
    #[allow(non_snake_case)]
    $v struct $name {
      $($passes: $passes,)*
    }
    impl $name {
      $v fn new() -> Self {
        Self {
          $($passes: $constructor,)*
        }
      }
      $v fn get_lints() -> LintArray {
        let mut lints = Vec::new();
        $(lints.extend_from_slice(&$passes::get_lints());)*
        lints
      }
    }
    impl EarlyLintPass for $name {
      expand_combined_early_lint_pass_methods!([$($passes),*], $methods);
    }
    #[allow(rustc::lint_pass_impl_without_macro)]
    impl LintPass for $name {
      fn name(&self) -> &'static str {
        panic!()
      }
    }
  )
}
declare_combined_early_lint_pass宏是生成 BuiltinCombinedEarlyLintPass 的主体。这个宏中做了以下工作:
      
  • 生成一个名为 BuiltinCombinedEarlyLintPass 的 struct,其中的属性为宏 early_lint_passes 提供的 lintpass 的 identifier。  
  • 实现 fn new() fn name() 和 fn get_lints() 方法。其中 new() 调用了 early_lint_passes 提供的 lintpass 的 constructor。  
  • 调用宏 expand_combined_early_lint_pass_methods,实现自身的 check_* 方法。
通过这个宏,BuiltinCombinedEarlyLintPass的定义变为:
pub struct BuiltinCombinedEarlyLintPass {
      UnusedParens: UnusedParens,
      UnusedBraces: UnusedBraces,
      UnusedImportBraces: UnusedImportBraces,
      UnsafeCode: UnsafeCode,
      AnonymousParameters: AnonymousParameters,
      EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns,
      NonCamelCaseTypes: NonCamelCaseTypes,
      DeprecatedAttr: DeprecatedAttr,
      WhileTrue: WhileTrue,
      NonAsciiIdents: NonAsciiIdents,
      HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
      IncompleteFeatures: IncompleteFeatures,
      RedundantSemicolons: RedundantSemicolons,
      UnusedDocComment: UnusedDocComment,
}
impl BuiltinCombinedEarlyLintPass {
  pub fn new() -> Self {
    Self {
      UnusedParens: UnusedParens,
      UnusedBraces: UnusedBraces,
      UnusedImportBraces: UnusedImportBraces,
      UnsafeCode: UnsafeCode,
      AnonymousParameters: AnonymousParameters,
      EllipsisInclusiveRangePatterns: EllipsisInclusiveRangePatterns::default(),
      NonCamelCaseTypes: NonCamelCaseTypes,
      DeprecatedAttr: DeprecatedAttr::new(),
      WhileTrue: WhileTrue,
      NonAsciiIdents: NonAsciiIdents,
      HiddenUnicodeCodepoints: HiddenUnicodeCodepoints,
      IncompleteFeatures: IncompleteFeatures,
      RedundantSemicolons: RedundantSemicolons,
      UnusedDocComment: UnusedDocComment,
    }
  }
  pub fn get_lints() -> LintArray {
    let mut lints = Vec::new();
    lints.extend_from_slice(&UnusedParens::get_lints());
    lints.extend_from_slice(&UnusedBraces::get_lints());
    lints.extend_from_slice(&UnusedImportBraces::get_lints());
    lints.extend_from_slice(&UnsafeCode::get_lints());
    lints.extend_from_slice(&AnonymousParameters::get_lints());
    lints.extend_from_slice(&EllipsisInclusiveRangePatterns::get_lints());
    lints.extend_from_slice(&NonCamelCaseTypes::get_lints());
    lints.extend_from_slice(&DeprecatedAttr::get_lints());
    lints.extend_from_slice(&WhileTrue::get_lints());
    lints.extend_from_slice(&NonAsciiIdents::get_lints());
    lints.extend_from_slice(&HiddenUnicodeCodepoints::get_lints());
    lints.extend_from_slice(&IncompleteFeatures::get_lints());
    lints.extend_from_slice(&RedundantSemicolons::get_lints());
    lints.extend_from_slice(&UnusedDocComment::get_lints());
    
    lints
  }
}
impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
  expand_combined_early_lint_pass_methods!([$($passes),*], $methods);
}
#[allow(rustc::lint_pass_impl_without_macro)]
impl LintPass for BuiltinCombinedEarlyLintPass {
  fn name(&self) -> &'static str {
    panic!()
  }
}
expand_combined_early_lint_pass_methods
macro_rules! expand_combined_early_lint_pass_methods {
  ($passes:tt, [$($(#[$attr:meta])* fn $name:ident($($param:ident: $arg:ty),*);)*]) => (
    $(fn $name(&mut self, context: &EarlyContext<'_>, $($param: $arg),*) {
      expand_combined_early_lint_pass_method!($passes, self, $name, (context, $($param),*));
    })*
  )
}
expand_combined_early_lint_pass_methods宏在 BuiltinCombinedEarlyLintPass 中展开所有 early_lint_methods 中定义的方法。 通过这个宏,BuiltinCombinedEarlyLintPass的定义变为(省略其他定义):
impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
  fn check_param(&mut self, context: &EarlyContext<'_>, a: &ast::Param) {
    expand_combined_early_lint_pass_method!($passes, self, $name, (context, $($param),*));
  }
  fn check_ident(&mut self, context: &EarlyContext<'_>, a: &ast::Ident) {
    expand_combined_early_lint_pass_method!($passes, self, $name, (context, $($param),*));
  }
  fn check_crate(&mut self, context: &EarlyContext<'_>, a: &ast::Crate) {
    expand_combined_early_lint_pass_method!($passes, self, $name, (context, $($param),*));
  }
  ...
  
}
expand_combined_early_lint_pass_method
macro_rules! expand_combined_early_lint_pass_method {
  ([$($passes:ident),*], $self: ident, $name: ident, $params:tt) => ({
    $($self.$passes.$name $params;)*
  })
}
expand_combined_early_lint_pass_method:在展开的check_* 函数中调用每一个 LintPass 的 check_*。 通过这个宏,BuiltinCombinedEarlyLintPass的定义变为(省略其他定义):
impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
  fn check_param(&mut self, context: &EarlyContext<'_>, a: &ast::Param) {
    self.UnusedParens.check_param(context, a);
    self.UnusedBraces.check_param(context, a);
    self.UnusedImportBraces.check_param(context, a);
    ...
  }
  fn check_ident(&mut self, context: &EarlyContext<'_>, a: &ast::Ident) {
    self.UnusedParens.check_ident(context, a);
    self.UnusedBraces.check_ident(context, a);
    self.UnusedImportBraces.check_ident(context, a);
    ...
  }
  fn check_crate(&mut self, context: &EarlyContext<'_>, a: &ast::Crate) {
    self.UnusedParens.check_crate(context, a);
    self.UnusedBraces.check_crate(context, a);
    self.UnusedImportBraces.check_crate(context, a);
    ...
  }
  ...
  
}
BuiltinCombinedEarlyLintPass 的最终定义
通过以上宏的展开,BuiltinCombinedEarlyLintPass的定义实际为如下形式:
pub struct BuiltinCombinedEarlyLintPass {
  UnusedParens: UnusedParens,
  UnusedBraces: UnusedBraces,
  ...
}
impl BuiltinCombinedEarlyLintPass{
  pub fn new() -> Self {
    UnusedParens: UnusedParens,
    UnusedBraces: UnusedBraces,
    ...
  }
  
  pub fn get_lints() -> LintArray {
    let mut lints = Vec::new();
    lints.extend_from_slice(&UnusedParens::get_lints());
    lints.extend_from_slice(&UnusedBraces::get_lints());
    ...
    lints
  }
}
impl EarlyLintPass for BuiltinCombinedEarlyLintPass {
  fn check_crates(&mut self, context: &EarlyContext<'_>, a: &ast::Crate){
    self.UnusedParens.check_crates (context, a);
    self.UnusedBraces.check_crates (context, a);
    ...
  }
  fn check_ident(&mut self, context: &EarlyContext<'_>, a: Ident){
    self.UnusedParens.check_ident (context, a);
    self.UnusedBraces.check_ident (context, a);
    ...
  }
  .. 
}
通过这个定义,可以在遍历 AST 时使用 BuiltinCombinedEarlyLintPass 的 check_* 方法实现多个 lintpass 的检查。
Lint 的进一步优化
基于 CombinedLintPass ,可以对上一篇文章中提出的 Linter 的设计做进一步优化。 DSC0000.jpg
这里,可以用 CombinedLintPass 的check_* 方法,在 Visitor 遍历 AST 时执行对应的检查。虽然效果与之前一致,但因为宏的关系,所有的 check_* 方法和需要执行的 lintpass 都被收集到了一个结构中,也更容易管理。同样的,因为 CombinedLintPass 实际上调用的是每个 lintpass 各自的 check 方法,虽然调用起来可能下图一样很复杂,但因为 lintpass 中定义的 check 方法大部分是由宏生成的空检查,所以也不会造成性能上的损失。 DSC0001.jpg
总结
本文简单介绍了 Rustc 源码中关于 CombinedLintPass 这一结构的定义和实现 ,并以此进一步优化 Linter 的设计。希望能够对理解 Rustc 及 Lint 有所帮助,如有错误,欢迎指正。后续的文章将继续介绍 Rustc 中 Lint 在编译过程中的注册和执行过程,期待继续关注。
参考链接

      
  • Rust源码剖析(电子书):https://github.com/awesome-kusion/rust-code-book  
  • KCL 配置策略语言:https://github.com/KusionStack/KCLVM  
  • KusionStack: https://github.com/KusionStack  
  • Rustc: https://github.com/rust-lang/rust  
  • rustc-dev-guide: https://rustc-dev-guide.rust-lang.org/  
  • Rust Visitor: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/visit/index.html  
  • Rust Clippy: https://github.com/rust-lang/rust-clippy