I keep reading that Rust is fast, has memory-safety guardrails, and is one of the most loved languages of late. After reading the book and wanting to really understand the language, I decided to build a static analyzer for PHP, the language I've been using for over a decade. I'm naming it Phanalist, and the code is available at https://github.com/denzyldick/phanalist.
To see if this idea would work, I made a list of four common bad practices I often see. I think we can all agree that the class below doesn't make anyone happy.
<?php
class uTesting extends FakeClass
{
const I_ = null;
const hello = null;
$no_= null, $no_modifier = null;
public function __construct($o)
{
$this->fake_variable = 'hellworld';
}
function test($a){
return 1;
}
}
To keep things simple, I will start by detecting these four mistakes, which are easy to spot. The list will grow in the future.
- Misplaced opening tag
- Class name that starts with a lowercase letter
- Lowercase constants
- Parameters defined without a type
The static analyzer should be beginner-friendly. Instead of making the developer's life harder, it should explain each error in a way the developer can understand. But first, let us focus on the first steps.
I will use PHP-Parser, a Rust library that parses PHP code and generates an AST (abstract syntax tree). The parser's output for the PHP code above is a vector (Vec<Statement>) containing all the statements.
[
FullOpeningTag(
Span {
line: 1,
column: 1,
position: 0,
},
),
Class(
ClassStatement {
attributes: [],
modifiers: ClassModifierGroup {
modifiers: [],
},
class: Span {
line: 2,
column: 2,
position: 7,
},
name: SimpleIdentifier {
span: Span {
line: 2,
column: 8,
position: 13,
},
value: "uTesting",
},
extends: Some(
ClassExtends {
extends: Span {
line: 2,
column: 17,
position: 22,
},
parent: SimpleIdentifier {
span: Span {
line: 2,
column: 25,
position: 30,
},
value: "FakeClass",
},
},
),
implements: None,
body: ClassBody {
left_brace: Span {
line: 3,
column: 3,
position: 42,
},
members: [
Constant(
ClassishConstant {
comments: CommentGroup {
comments: [],
},
attributes: [],
modifiers: ConstantModifierGroup {
modifiers: [],
},
const: Span {
line: 4,
column: 5,
position: 48,
},
entries: [
ConstantEntry {
name: SimpleIdentifier {
span: Span {
line: 4,
column: 11,
position: 54,
},
value: "I_",
},
equals: Span {
line: 4,
column: 14,
position: 57,
},
value: Null,
},
],
semicolon: Span {
line: 4,
column: 20,
position: 63,
},
},
),
Constant(
ClassishConstant {
comments: CommentGroup {
comments: [],
},
attributes: [],
modifiers: ConstantModifierGroup {
modifiers: [],
},
const: Span {
line: 5,
column: 5,
position: 69,
},
entries: [
ConstantEntry {
name: SimpleIdentifier {
span: Span {
line: 5,
column: 11,
position: 75,
},
value: "hello",
},
equals: Span {
line: 5,
column: 17,
position: 81,
},
value: Null,
},
],
semicolon: Span {
line: 5,
column: 23,
position: 87,
},
},
),
Property(
Property {
attributes: [],
modifiers: PropertyModifierGroup {
modifiers: [],
},
type: None,
entries: [
Initialized {
variable: SimpleVariable {
span: Span {
line: 6,
column: 5,
position: 93,
},
name: "$no_",
},
equals: Span {
line: 6,
column: 9,
position: 97,
},
value: Null,
},
Initialized {
variable: SimpleVariable {
span: Span {
line: 6,
column: 17,
position: 105,
},
name: "$no_modifier",
},
equals: Span {
line: 6,
column: 30,
position: 118,
},
value: Null,
},
],
end: Span {
line: 6,
column: 36,
position: 124,
},
},
),
ConcreteConstructor(
ConcreteConstructor {
comments: CommentGroup {
comments: [],
},
attributes: [],
modifiers: MethodModifierGroup {
modifiers: [
Public(
Span {
line: 8,
column: 5,
position: 131,
},
),
],
},
function: Span {
line: 8,
column: 12,
position: 138,
},
ampersand: None,
name: SimpleIdentifier {
span: Span {
line: 8,
column: 21,
position: 147,
},
value: "__construct",
},
parameters: ConstructorParameterList {
comments: CommentGroup {
comments: [],
},
left_parenthesis: Span {
line: 8,
column: 32,
position: 158,
},
parameters: CommaSeparated {
inner: [
ConstructorParameter {
attributes: [],
comments: CommentGroup {
comments: [],
},
ampersand: None,
name: SimpleVariable {
span: Span {
line: 8,
column: 33,
position: 159,
},
name: "$o",
},
data_type: None,
ellipsis: None,
default: None,
modifiers: PromotedPropertyModifierGroup {
modifiers: [],
},
},
],
commas: [],
},
right_parenthesis: Span {
line: 8,
column: 35,
position: 161,
},
},
body: MethodBody {
comments: CommentGroup {
comments: [],
},
left_brace: Span {
line: 9,
column: 5,
position: 167,
},
statements: [
Expression(
ExpressionStatement {
expression: AssignmentOperation(
Assign {
left: PropertyFetch {
target: Variable(
SimpleVariable(
SimpleVariable {
span: Span {
line: 10,
column: 7,
position: 175,
},
name: "$this",
},
),
),
arrow: Span {
line: 10,
column: 12,
position: 180,
},
property: Identifier(
SimpleIdentifier(
SimpleIdentifier {
span: Span {
line: 10,
column: 14,
position: 182,
},
value: "fake_variable",
},
),
),
},
equals: Span {
line: 10,
column: 28,
position: 196,
},
right: Literal(
String(
LiteralString {
value: "'hellworld'",
span: Span {
line: 10,
column: 30,
position: 198,
},
},
),
),
},
),
ending: Semicolon(
Span {
line: 10,
column: 41,
position: 209,
},
),
},
),
],
right_brace: Span {
line: 11,
column: 5,
position: 215,
},
},
},
),
ConcreteMethod(
ConcreteMethod {
comments: CommentGroup {
comments: [],
},
attributes: [],
modifiers: MethodModifierGroup {
modifiers: [],
},
function: Span {
line: 13,
column: 5,
position: 222,
},
ampersand: None,
name: SimpleIdentifier {
span: Span {
line: 13,
column: 14,
position: 231,
},
value: "test",
},
parameters: FunctionParameterList {
comments: CommentGroup {
comments: [],
},
left_parenthesis: Span {
line: 13,
column: 18,
position: 235,
},
parameters: CommaSeparated {
inner: [
FunctionParameter {
comments: CommentGroup {
comments: [],
},
name: SimpleVariable {
span: Span {
line: 13,
column: 19,
position: 236,
},
name: "$a",
},
attributes: [],
data_type: None,
ellipsis: None,
default: None,
ampersand: None,
},
],
commas: [],
},
right_parenthesis: Span {
line: 13,
column: 21,
position: 238,
},
},
return_type: None,
body: MethodBody {
comments: CommentGroup {
comments: [],
},
left_brace: Span {
line: 13,
column: 22,
position: 239,
},
statements: [
Return(
ReturnStatement {
return: Span {
line: 14,
column: 7,
position: 247,
},
value: Some(
Literal(
Integer(
LiteralInteger {
value: "1",
span: Span {
line: 14,
column: 14,
position: 254,
},
},
),
),
),
ending: Semicolon(
Span {
line: 14,
column: 15,
position: 255,
},
),
},
),
],
right_brace: Span {
line: 15,
column: 5,
position: 261,
},
},
},
),
],
right_brace: Span {
line: 17,
column: 3,
position: 266,
},
},
},
),
]
We will find the four mistakes listed earlier by navigating the tree. Let's start with the first one, the misplaced opening tag:
<?php
According to the PSR-2 coding standard, the PHP opening tag should always be at the beginning of the file. When it isn't, the whitespace before the tag is sent to the client before your PHP code runs, which can result in the "headers already sent" error. This mistake is easy to find in the AST. We can iterate through the items in the vector and use pattern matching to find the FullOpeningTag statement.
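for statement in statements {
    match statement {
        // In the AST above, this variant wraps the Span directly.
        Statement::FullOpeningTag(span) => {
            project.opening_tag(span, file);
        }
        // Every other kind of statement is ignored by this check.
        _ => {}
    }
}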
As you can see in the AST, the information we need is stored in the Span struct.
FullOpeningTag(
Span {
line: 1,
column: 1,
position: 0,
},
),
We only need to check whether the line and column fields are greater than 1, since both are 1-based. If either is, the opening tag is not in the correct position, and we push a suggestion onto the suggestions field of the file parameter that is passed to the opening_tag function.
pub fn opening_tag(&mut self, span: Span, file: &mut File) -> &mut Project {
    // Lines are 1-based, so anything past line 1 is too far down.
    if span.line > 1 {
        file.suggestions.push(Suggestion::from(
            "The opening tag <?php is not on the right line. This should always be the first line in a PHP file.".to_string(),
            span,
        ));
    }
    // Columns are 1-based as well.
    if span.column > 1 {
        file.suggestions.push(Suggestion::from(
            format!(
                "The opening tag doesn't start at the right column: {}.",
                span.column
            ),
            span,
        ));
    }
    self
}
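To try the check in isolation, here is a minimal, self-contained sketch. The `Span` and `Suggestion` definitions are simplified stand-ins I made up for illustration, not the parser's or Phanalist's real types, and `check_opening_tag` is a hypothetical free function that returns the suggestions instead of pushing them onto a file.

```rust
// Hypothetical stand-ins for the parser's `Span` and Phanalist's `Suggestion`;
// the real types carry more information.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub line: usize,
    pub column: usize,
    pub position: usize,
}

#[derive(Debug, PartialEq)]
pub struct Suggestion {
    pub message: String,
    pub span: Span,
}

/// Returns one suggestion for every way the opening tag is misplaced.
/// Lines and columns are 1-based, as in the AST dump above.
pub fn check_opening_tag(span: Span) -> Vec<Suggestion> {
    let mut suggestions = Vec::new();
    if span.line > 1 {
        suggestions.push(Suggestion {
            message: "The opening tag <?php should be on the first line of the file.".to_string(),
            span,
        });
    }
    if span.column > 1 {
        suggestions.push(Suggestion {
            message: format!(
                "The opening tag starts at column {} instead of column 1.",
                span.column
            ),
            span,
        });
    }
    suggestions
}
```

With the span from the dump above (line 1, column 1) the returned vector is empty; a tag on line 3, column 5 produces two suggestions, one per check.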
I won't explain how to navigate to the correct statement in the AST for the rest of the mistakes.
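For a rough picture of what that navigation looks like, here is a sketch over a heavily simplified AST. Every type and function in it (`Statement`, `ClassStatement`, `ClassMember`, `collect_targets`) is a hypothetical reduction written for this example; the parser's real types have many more variants and fields, as the dump above shows.

```rust
// Hypothetical, heavily simplified AST for illustration only.
pub enum Statement {
    FullOpeningTag,
    Class(ClassStatement),
}

pub struct ClassStatement {
    pub name: String,
    pub members: Vec<ClassMember>,
}

pub enum ClassMember {
    Constant { names: Vec<String> },
    Method { name: String, parameters: Vec<String> },
}

/// Walks the statements and returns a short description of every node that
/// one of the checks would want to inspect.
pub fn collect_targets(statements: &[Statement]) -> Vec<String> {
    let mut targets = Vec::new();
    for statement in statements {
        // Only classes have children that matter for these checks.
        if let Statement::Class(class) = statement {
            targets.push(format!("class {}", class.name));
            for member in &class.members {
                match member {
                    ClassMember::Constant { names } => {
                        for name in names {
                            targets.push(format!("constant {}", name));
                        }
                    }
                    ClassMember::Method { name, parameters } => {
                        for parameter in parameters {
                            targets.push(format!("parameter {} of {}", parameter, name));
                        }
                    }
                }
            }
        }
    }
    targets
}
```

The shape is the same for the real tree: match on the statement, descend into the class body, and match again on each member to reach constants and parameters.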
Class name that starts in lowercase.
class uTesting extends FakeClass
When you do this, it's harder to distinguish a class from a variable or a method.
I need to find out whether the first letter of the class name is capitalized. The String type has a chars method that returns an iterator over the string's characters. You can grab the first character with the next() function. The char type has some valuable methods; the one we need is is_uppercase().
pub fn has_capitalized_name(name: String, span: Span) -> Option<Suggestion> {
    // `next()` yields None for an empty name, so there is nothing to unwrap.
    match name.chars().next() {
        Some(first) if !first.is_uppercase() => Some(Suggestion::from(
            format!("The class name {} is not capitalized. The first letter of the name of the class should be in uppercase.", name),
            span,
        )),
        // Capitalized or empty names produce no suggestion.
        _ => None,
    }
}
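The chars(), next(), and is_uppercase() chain can be tried on plain strings. The sketch below uses only the standard library; `class_name_warning` is a hypothetical helper that returns the message as a String, so it does not depend on Phanalist's Suggestion type.

```rust
/// Returns a message when `name` does not start with an uppercase letter.
/// An empty name produces no message instead of panicking.
pub fn class_name_warning(name: &str) -> Option<String> {
    match name.chars().next() {
        Some(first) if !first.is_uppercase() => Some(format!(
            "The class name {} is not capitalized. The first letter should be uppercase.",
            name
        )),
        _ => None,
    }
}
```

Calling it with "uTesting" returns a message, while "Testing" and the empty string both return None. Note that a name starting with an underscore or a digit is also reported, because is_uppercase() is false for those characters.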
Lowercase constants
const I_ = null;
const hello = null;
Similar to the capitalized class name, it's easier to distinguish a normal variable from a constant if the constant is in uppercase.
I assume the constant is already uppercase, so I initialize the variable is_uppercase = true. While iterating through the letters, if I see a letter that is not uppercase, I set is_uppercase = false.
pub fn uppercased_constant_name(entry: ConstantEntry) -> bool {
    // Only the name is needed; `..` ignores the other fields.
    let ConstantEntry { name, .. } = entry;
    // Assume the name is uppercase until a lowercase letter proves otherwise.
    let mut is_uppercase = true;
    for l in name.value.to_string().chars() {
        // Digits and underscores are not alphabetic, so they are skipped.
        if l.is_alphabetic() && !l.is_uppercase() {
            is_uppercase = false;
        }
    }
    is_uppercase
}
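The same rule can also be written with iterator adapters instead of a mutable flag. This is an alternative formulation and not the code Phanalist uses; `is_uppercase_constant` is a hypothetical name, and it takes a plain &str so it runs without the parser.

```rust
/// Returns true when every alphabetic character in `name` is uppercase.
/// Digits and underscores are ignored.
pub fn is_uppercase_constant(name: &str) -> bool {
    name.chars()
        .filter(|c| c.is_alphabetic())
        .all(|c| c.is_uppercase())
}
```

For the two constants in the example class, "I_" passes and "hello" fails. A name with no letters at all, such as "_", also passes, because there is no letter to violate the rule.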
Defining parameters without a type.
I strongly advocate declaring a type for every parameter. Without one, callers can pass any value, which opens the door to misreading your code and to unnecessary bugs. In the AST, the FunctionParameter struct contains the parameter. We can pattern match on the data_type field: the None arm returns true, and the Some(_) arm returns false.
pub fn function_parameter_without_type(parameter: FunctionParameter) -> bool {
    // Only `data_type` is relevant; `..` ignores the other fields.
    let FunctionParameter { data_type, .. } = parameter;
    match data_type {
        // No type was declared for this parameter.
        None => true,
        // Any declared type is accepted.
        Some(_) => false,
    }
}
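Applied to a whole parameter list, the check might look like the sketch below. The `Parameter` struct is a hypothetical, simplified stand-in for the parser's FunctionParameter, with the type stored as an optional string, and `untyped_parameters` is a made-up helper name.

```rust
// Hypothetical, simplified parameter; the parser's `FunctionParameter`
// has more fields and a richer type representation.
pub struct Parameter {
    pub name: String,
    pub data_type: Option<String>,
}

/// Returns the names of all parameters that have no declared type.
pub fn untyped_parameters(parameters: &[Parameter]) -> Vec<&str> {
    parameters
        .iter()
        .filter(|p| p.data_type.is_none())
        .map(|p| p.name.as_str())
        .collect()
}
```

Returning the names instead of a single bool makes it straightforward to build one suggestion per offending parameter.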
Conclusion
In part 2, I will show how to calculate the cyclomatic complexity of some example code.
If you are a PHP developer, you know there are many more mistakes a developer can make in PHP. In the future, this list will grow into something more useful.
Thanks for reading!
Contributions are always welcome if there is a mistake you would like Phanalist to detect.