Konstantin Knizhnik: static analysis, a view from aside

Author: Andrey Karpov

Date: 10.01.2009

Abstract
This article is an interview with Konstantin Knizhnik conducted by Andrey Karpov of the company
"Program Verification Systems". The interview discusses static code analysis, the relevance of existing
solutions in this field, and the prospects of using static analysis in application development.

Introduction
OOO "Program Verification Systems", which develops tools for program testing and verification,
asked Konstantin Knizhnik, a specialist in static code analysis methodology, to answer some
questions. The interview was conducted and prepared as this article by Andrey Karpov, an employee
of OOO "Program Verification Systems".

The interview touches upon static code analysis and the relevance of existing solutions in this field.
The prospects of using static analysis in the development of parallel applications are also discussed,
and an outside assessment is given of the static analysis tools Viva64 and VivaMP developed by OOO
"Program Verification Systems". In addition, some general issues of program verification are discussed,
which we hope will interest readers exploring this area of application testing.

The questions (in bold) are asked by:
Andrey Karpov, Candidate of Physico-mathematical Sciences - technical director of "Program
Verification Systems"; develops the static code analysis tools Viva64 and VivaMP for testing 64-bit
and parallel applications. The author of several articles on static code analysis.

The questions are answered by:
Konstantin Knizhnik, Candidate of Physico-mathematical Sciences - the author of several articles on
static analysis of program code and a developer of verifiers for Java applications; he has participated
and continues to participate in many interesting projects, for example, at WebAlta.

The interview's text

Is it true that you have studied static analysis and even took part in
creating a static code analyzer for Java applications?
Yes, I did study static analysis and program verification. It began in 1997, when I wrote jlint, a small
program that was a Java analogue of the C lint tool.

Please tell us about jlint in more detail.
The program consisted of two parts. The first was a very simple static analyzer for languages with C-like
syntax. As is well known, C syntax has many places that lead to hard-to-detect errors, for example,
"=" written instead of "==", or an empty loop body caused by a misplaced ";". I won't list more, since I
suppose these problems are familiar enough.
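
Both pitfalls mentioned above survive in Java in a reduced form, which the following minimal sketch tries to show. The class and method names are hypothetical and chosen only for illustration. Java rejects "=" in a condition of type int, but the same slip still compiles when the variable is a boolean.

```java
class SyntaxPitfalls {
    /** Intended to count n iterations, but the stray ';' ends the loop early. */
    static int countWithStraySemicolon(int n) {
        int count = 0;
        for (int i = 0; i < n; i++);   // stray ';' is the (empty) loop body
        {
            count++;                   // plain block: runs exactly once
        }
        return count;
    }

    /** Intended to test the flag, but '=' assigns true to it instead. */
    static boolean isReadyBuggy(boolean ready) {
        if (ready = true) {            // assignment, not comparison: always true
            return true;
        }
        return false;
    }
}
```

Both methods compile, yet neither does what its author presumably meant, which is exactly the kind of place a simple syntax-level audit can flag.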

The second part, jlint proper, was a more or less self-contained semantic analyzer for Java. I didn't
want to get involved in writing my own Java parser at the time, so I decided to read the already
compiled bytecode and analyze that. The checks included null reference dereferences, incompatible
casts, expressions that are always true or always false, loss of precision and the like.

The most interesting feature of jlint was its ability to detect potential deadlocks in a program. Java has a
very simple locking mechanism: a synchronized method or a synchronized(expr) construct. By
analyzing these constructs, jlint tried to build a lock graph (whose nodes are the locked resources) and
to find cycles in it. Of course, it is impossible to build a precise graph, so we built an approximate one
using classes instead of concrete instances. Unfortunately, this check didn't work well on real projects
and produced a lot of false positives.
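
The lock-graph idea can be sketched roughly as follows. This is not jlint's actual code, only a minimal illustration under stated assumptions: nodes are class names, an edge from A to B means that some code acquires a lock on an instance of B while holding a lock on an instance of A, and the hypothetical LockGraph class reports a potential deadlock whenever the graph contains a cycle.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LockGraph {
    // Adjacency sets: class name -> classes locked while this one is held.
    private final Map<String, Set<String>> edges = new HashMap<>();

    /** Records that a lock on 'inner' is taken while a lock on 'outer' is held. */
    void addNestedLock(String outer, String inner) {
        edges.computeIfAbsent(outer, k -> new HashSet<>()).add(inner);
        edges.computeIfAbsent(inner, k -> new HashSet<>());
    }

    /** Returns true if the graph has a cycle, i.e. a potential deadlock. */
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; meeting a node already on the current path closes a cycle.
    private boolean visit(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (done.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : edges.get(node)) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

Because nodes are classes rather than instances, two locks on different objects of the same class already look like a cycle, which hints at where false positives of this kind come from.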

How does the story of jlint relate to your work on similar projects at
TogetherSoft and Borland? What were your responsibilities at these
companies?
Some years later my jlint was noticed by the founder of TogetherSoft, the company that produced a
UML modeling tool and was moving towards developing a complete IDE for Java. So I began working at
TogetherSoft. At first, though, I developed an OODBMS, then a version control system, and only after
that did I take part in developing the Java verifier.

This time everything was serious: full syntactic parsing, data flow analysis and other features
were implemented. The audits numbered in the hundreds. Some of them, of course, were rather simple,
but others bordered on rudimentary artificial intelligence. For example, the verifier detected
mixed-up index variables in nested loops, and it recognized standard sequences of operations on a
resource (open, read/write, close) and searched for places where these sequences were broken.
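
The resource-sequence audit can be pictured as a small state machine. The sketch below is an assumption-laden illustration, not the TogetherSoft verifier: the hypothetical ResourceProtocolCheck class receives the operations performed on one resource as a list of strings and reports where the open, read/write, close order is first broken.

```java
import java.util.List;

class ResourceProtocolCheck {
    /**
     * Returns the index of the first operation that breaks the
     * open -> read/write -> close order, ops.size() if the resource
     * is never closed, or -1 if the sequence is well formed.
     */
    static int firstViolation(List<String> ops) {
        boolean open = false;
        for (int i = 0; i < ops.size(); i++) {
            switch (ops.get(i)) {
                case "open":
                    if (open) {
                        return i;      // opened twice without close
                    }
                    open = true;
                    break;
                case "read":
                case "write":
                    if (!open) {
                        return i;      // used before open or after close
                    }
                    break;
                case "close":
                    if (!open) {
                        return i;      // closed without being open
                    }
                    open = false;
                    break;
                default:
                    break;             // unrelated operations are ignored
            }
        }
        return open ? ops.size() : -1; // still open at the end: leak
    }
}
```

A real verifier would derive such sequences from data flow analysis over every execution path; here the path is simply given as input.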

In general, a lot was done, including some rather original things. It is certainly amusing to run the
verifier on source code such as the JDK and get about a dozen critical null dereference errors. In most
cases, of course, they occur in error handlers, that is, in places that are practically never executed.

Interestingly, there were no Java verifiers on the market then. So there was an opportunity for
great success, but somehow we didn't manage to take it.

TogetherSoft was sold to Borland. We had to add support for languages such as C++, Delphi and
Visual Basic, and provide integration with JBuilder and Eclipse. In the end our verifier did reach users,
but in a very poor form (mainly because it had to work on an AST provided by other subsystems, which
were too slow and didn't contain the information the verifier needed). By that time it was too late,
because verifiers existed for nearly all popular Java IDEs. Few of them attempted such deep analysis,
and in most cases everything came down to syntax-level checks, but these differences were not easy to
notice.

Then Borland was hit by a crisis, and for several years now I have been working in an entirely
different field.

What are you doing now?
At present I am taking part in several projects at once. At WebAlta, for example, I work on search
engines. Besides WebAlta, I am involved in creating an OODBMS for plug-in systems. There are some
other projects as well.

And what about the further fate of JLint?
As my work in TogetherSoft and then in Borland was directly connected with program verification, I had
to give up my jlint. By the way, you can download the existing jlint from my site:
http://www.garret.ru/lang.html.

As a specialist in static analysis, what observations and
advice could you share with us? What should "Program Verification
Systems" take into account as it continues to develop
the static analyzers Viva64 and VivaMP?
Well, I'll try to briefly list the main conclusions I've made during the years devoted to the problem of
program verification.

1. Most errors in real large projects are found by the most "stupid" audits. A missing break in a switch
is one example. It is very useful when such audits run interactively in the development environment,
so that unsafe places are marked immediately.

2. Messages should be divided according to the confidence level (May-Be-Null, Is-Null, ...) with the
possibility to turn off separate groups as well.

3. To perform a full analysis you need a symbolic calculator, to understand, for instance, that i+1 > i.
Of course, I am aware that because of overflow this condition is not always true, but the verifier's task
is precisely to look for such suspicious places.

4. The worse a language is designed, the more work there is for the verifier. For example, C syntax
causes a lot of programmer errors, and any C/C++ programmer has faced this problem more than once.
In Java many of these defects were corrected, but not all of them. Many of our audits were busy trying
to detect enums (which were absent from Java until recently) by various heuristic rules and to give
them something like static type checking. Of course, all this turned out to be unnecessary in C#.

5. The most important thing in a verifier is to maintain a reasonable balance between suspiciousness and
"talkativeness". If I get several thousand messages on a small project, I simply won't be able to
check them all. So the messages need to be divided by degree of criticality. But take a rather critical
error, for example, may-be-null, which can be caused by code like the following:

if (x == null)
{
    DoSomething();
}
x.InvokeSomeMethod();

we must understand that a programmer who has checked the first several suspicious places without
finding an error in them won't look at the remaining messages.

That's why the rules "It's better to say nothing than to say nonsense" and "If not sure, keep silent" are
highly relevant for a verifier.
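
Conclusions 1 and 3 above can be made concrete with a small Java sketch. The AuditExamples class and its method names are hypothetical; the first method shows the missing break that a "stupid" audit catches, the second shows why i+1 > i deserves a warning once fixed-width integer overflow is taken into account.

```java
class AuditExamples {
    /** Missing break: case 1 falls through into case 2. */
    static String describeBuggy(int code) {
        String result = "unknown";
        switch (code) {
            case 1:
                result = "one";
                // BUG: no break here, so execution continues into case 2
            case 2:
                result = "two";
                break;
            default:
                break;
        }
        return result;
    }

    /** Looks always true, but int arithmetic wraps around on overflow. */
    static boolean nextIsGreater(int i) {
        return i + 1 > i;
    }
}
```

Calling describeBuggy(1) yields "two" rather than "one", and nextIsGreater(Integer.MAX_VALUE) is false because Integer.MAX_VALUE + 1 wraps to Integer.MIN_VALUE. A verifier with symbolic reasoning could report the comparison as suspicious without running the code.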

What do you think about the practicality and usefulness of creating static
analyzers such as VivaMP for verifying parallel programs?
New languages (such as Java and C#) with implicit memory release and no address arithmetic have made
program behavior nearly deterministic. They have saved millions of man-hours that used to be spent on
debugging ("where does the memory go", "who deletes this variable" and so on) and removed the need
for tools like BoundsChecker, whose task was to fight the abuse of C/C++ features.

But unfortunately parallel programming, the creation of multithreaded applications without which even
a simple task cannot be solved nowadays, deprives us of this determinism. It throws us back to the times
when a programmer had to spend too much time on debugging and run tests around the clock, not so
much to become convinced that there were no errors (a test can only show their presence, not prove
their absence) as to clear his own and the team leader's conscience.

Moreover, while writing a parallel program used to demand great effort (and still does in C++), in
C#/Java it is much easier to create a new thread. This apparent simplicity creates the illusion that
parallel programming is easy. Unfortunately it is not, and as far as I know there are no models of
parallelism that do for parallel programs what garbage collection did for ordinary ones (leaving aside
purely functional languages, where execution can be parallelized automatically without distracting the
programmer).

If we cannot solve a problem at the level of proper language design, we have to provide support with
static and dynamic verifiers. That's why I consider detecting deadlocks and race conditions in programs
one of the most important tasks of modern verifiers.

You have read our articles "20 issues of porting C++ code on a 64-bit
platform", "32 OpenMP traps for C++ developers" and others. Could you
evaluate and comment on them?
Thank you very much for the links to these articles, I liked them. I'll send the links to my colleagues. I
think this information will be very useful for many of them, especially the young and inexperienced ones.
When I worked at Digital Equipment I came across the first 64-bit systems (DEC Alpha). The projects
were mainly porting jobs (from VMS/VAX to Digital Unix/Alpha and from RSX/PDP to OpenVMS/Alpha).
So we ran into all those problems of porting to a platform with a different word size that you describe
in your articles. The situation was further complicated by the fact that Alpha required strict data
alignment.

Have you looked at the demo versions of Viva64 and VivaMP? What would
you advise to make them more popular? Which ways of promoting them
on the market could be successful, in your opinion?
I haven't yet looked at the Viva64 and VivaMP tools themselves, but I promise that I will. From my
experience of working on the verifier at TogetherSoft/Borland, however, I can say, or even warn, that
like any commercial product a verifier consists of about 10% interesting ideas and algorithms and 90%
rather boring things without which, unfortunately, a commercial product cannot exist:

• Integration with many (ideally all) popular development environments. This is rather difficult,
as it concerns not only the interface but also the need to work skilfully with the IDE's own
internal representation of the program (AST) in order to integrate fully.
• Standalone mode (one's own parser, report generator etc.).
• Incremental operation.
• Autofixes (the ability to correct simple errors automatically).
• Generation of various reports and diagrams, export to Excel etc.
• Integration with automated build systems.
• Examples. For each message there should be one simple and clear example in each of the
supported languages.
• Documentation. For each message there should be interactive help explaining why the message
was shown. A User's Guide is needed as well.
• Detailed and convenient pinpointing of the problem. For example, if we say that a deadlock can
occur here, we should show the user the whole path by which the deadlock can arise.
• Scalability. The verifier should be able to process a project of several million lines in a
reasonable time.
• A good site with a forum, a blog by one of the leading developers and regularly updated topical
information. For example, at one time C-lint ran an advertisement in every issue of Dr. Dobb's
magazine with an example of an error that C-lint finds. It looked like a puzzle and really
attracted the public's attention to the product.

All in all, as everywhere, it turns out that writing a product is fairly easy (one person can do it in
several months). But turning it into a real commercial product takes much more effort, of a not very
creative or interesting kind. And being able to sell it and make money from it requires quite a different
talent.

Thank you very much for the conversation and for your interesting and
informative answers.
Thank you. I was glad to talk to you too.

Conclusion
We would like to thank Konstantin once again for the interview and for permission to publish this
material on the Internet. We find many of his pieces of advice very useful and will certainly try to apply
them in our static analysis products.

References
1. Konstantin Knizhnik's homepage. http://www.viva64.com/go.php?url=146
2. JLint description. http://www.viva64.com/go.php?url=147
3. OOO "Program Verification Systems" site. http://www.viva64.com
4. Konstantin Knizhnik. Creation of multithread applications in Java (RU).
http://www.viva64.com/go.php?url=148
5. Askar Rahimberdiev, Konstantin Knizhnik, Igor Abramov. The static analyzer of errors in
Java programs (RU). http://www.viva64.com/go.php?url=149
6. I.V. Abramov, S.E. Gorelkin, E.A. Gorelkina, K.A. Knizhnik, A.M. Rahimberdiev. Experience of
developing a static analyzer for searching errors in Java programs. // Informational technologies
and programming: Intercollege article collection. Issue 2 (7) M.: MSIU, 2003. 62 pp.
7. Knizhnik, Konstantin. "Reflection for C++." The Garret Group. 4 Nov. 2005.
http://www.viva64.com/go.php?url=150
8. Andrey Karpov, Evgeniy Ryzhkov. 20 issues of porting C++ code on the 64-bit platform.
http://www.viva64.com/art-1-2-599168895.html
9. Alexey Kolosov, Andrey Karpov, Evgeniy Ryzhkov. 32 OpenMP traps for C++ developers.
http://www.viva64.com/art-3-2-1023467288.html
