For parsing implicit dependencies, see smatch_scripts/implicit_dependencies.
=======
  sparse (spärs), adj., spars-er, spars-est.
	1. thinly scattered or distributed; "a sparse population"
	2. thin; not thick or dense: "sparse hair"
	3. scanty; meager.
	4. semantic parse
	[ from Latin: spars(us) scattered, past participle of
	  spargere 'to sparge' ]

	Antonym: abundant

Sparse is a semantic parser of source files: it is neither a compiler
(although it could be used as a front-end for one) nor a preprocessor
(although it contains a preprocessing phase).

It is meant to be a small - and simple - library.  Scanty and meager,
and partly because of that, easy to use.  It has one mission in life:
create a semantic parse tree for some arbitrary user for further
analysis.  It's not a tokenizer, nor is it some generic context-free
parser.  In fact, context (semantics) is what it's all about - figuring
out not just what the groupings of tokens are, but what the _types_ are
that the groupings imply.

And no, it doesn't use lex and yacc (or flex and bison).  In my personal
opinion, the result of using lex/yacc tends to end up just having to
fight the assumptions the tools make.

The parsing is done in five phases:

 - full-file tokenization
 - pre-processing (which can cause another tokenization phase of another
   file)
 - semantic parsing
 - lazy type evaluation
 - inline function expansion and tree simplification

Note the "full file" part. Partly for efficiency, but mostly for ease of
use, there are no "partial results". The library completely parses one
whole source file, and builds up the _complete_ parse tree in memory.

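As a rough illustration of the "full file" idea, here is a toy sketch
that tokenizes an entire buffer before any later phase looks at it.  It
is not sparse's tokenizer: struct toy_token and toy_tokenize() are
hypothetical names made up for this example, and the token rules are
deliberately simplistic.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token record; sparse's real tokens carry much more. */
struct toy_token {
	const char *start;	/* points into the source buffer */
	size_t len;		/* length of the token in bytes */
};

/*
 * Tokenize the whole buffer up front.  Returns the number of tokens
 * stored in out[] (at most max).  Runs of letters, digits and '_' form
 * one token; any other non-space character is a one-character token.
 */
static size_t toy_tokenize(const char *src, struct toy_token *out, size_t max)
{
	size_t n = 0;
	const char *p = src;

	assert(src != NULL && out != NULL);
	while (*p && n < max) {
		const char *start;

		if (isspace((unsigned char)*p)) {
			p++;
			continue;
		}
		start = p;
		if (isalnum((unsigned char)*p) || *p == '_') {
			while (isalnum((unsigned char)*p) || *p == '_')
				p++;
		} else {
			p++;
		}
		out[n].start = start;
		out[n].len = (size_t)(p - start);
		n++;
	}
	return n;
}
```

A caller would run toy_tokenize() once over the complete source text and
hand the finished array to the next phase, so that later phases never
see a partially tokenized file.
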
Also note the "lazy" in the type evaluation.  The semantic parsing
itself will know which symbols are typedefs (required for parsing C
correctly), but it will not have calculated what the details of the
different types are.  That will be done only on demand, as the back-end
requires the information.

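The "only on demand" part can be pictured with a small sketch of lazy
evaluation with caching.  This is an illustration of the general
technique only, not how sparse represents types: struct toy_type and
toy_type_size() are hypothetical, and the eval_calls counter exists
purely to make the laziness visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type node; not sparse's struct symbol. */
struct toy_type {
	struct toy_type *base;	/* element type, or NULL for a leaf */
	size_t count;		/* number of elements (1 for a leaf) */
	size_t leaf_size;	/* size in bytes, used for leaves only */
	int evaluated;		/* has the size been computed yet? */
	size_t size;		/* cached result, valid once evaluated */
	int eval_calls;		/* instrumentation for this example */
};

/* Compute the size on first request and reuse the cached value after. */
static size_t toy_type_size(struct toy_type *t)
{
	assert(t != NULL);
	if (!t->evaluated) {
		t->eval_calls++;
		t->size = t->base ? t->count * toy_type_size(t->base)
				  : t->leaf_size;
		t->evaluated = 1;
	}
	return t->size;
}
```

Nothing is computed when the nodes are created; the first call to
toy_type_size() on an array type evaluates its element type as a side
effect, and repeated calls return the cached size.
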
This means that a user of the library will literally just need to do

  struct string_list *filelist = NULL;
  char *file;

  /* sparse_initialize() fills in filelist, so pass its address */
  action(sparse_initialize(argc, argv, &filelist));

  FOR_EACH_PTR(filelist, file) {
    action(sparse(file));
  } END_FOR_EACH_PTR(file);

and he is now done - having a full C parse of the file he opened.  The
library doesn't need any more setup, and once done does not impose any
more requirements.  The user is free to do whatever he wants with the
parse tree that got built up, and need not worry about the library ever
again.  There is no extra state, there are no parser callbacks, there is
only the parse tree that is described by the header files. The action
function takes a pointer to a symbol_list and does whatever it likes with it.

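To give a feel for what such an action function might look like, here
is a self-contained sketch that walks a list of symbols and prints their
names.  The types are stand-ins invented for this example
(struct toy_symbol, struct toy_symbol_list, toy_action()); the real
symbol_list is defined by the library's header files and is walked with
the library's own list macros.  Returning a count is also just a
convenience of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a parsed symbol; the real structure carries far more. */
struct toy_symbol {
	const char *name;
	struct toy_symbol *next;
};

/* Stand-in for the symbol list handed to the action function. */
struct toy_symbol_list {
	struct toy_symbol *head;
};

/*
 * A hypothetical action: print each symbol name and return how many
 * symbols were seen.  It keeps no state between calls.
 */
static int toy_action(const struct toy_symbol_list *list)
{
	int count = 0;
	const struct toy_symbol *sym;

	assert(list != NULL);
	for (sym = list->head; sym; sym = sym->next) {
		printf("symbol: %s\n", sym->name);
		count++;
	}
	return count;
}
```

The point of the sketch is the shape of the interface: the client gets a
finished list, inspects it however it likes, and owes the library
nothing afterwards.
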
The library also contains (as example users) a few clients that do the
preprocessing, parsing and type evaluation and just print out the
results.  These clients were written to verify and debug the library, and
also as trivial examples of what you can do with the parse tree once it
is formed, so that users can see how the tree is organized.