pageview - tracks visited HTML pages
A record stream originating from the Extract HTTP tool.
A record set. Optional.
The original record is output with four fields appended.
This tool determines whether a hit is to be considered a page impression.
An exclude-set can be supplied that lists URLs that must not be considered pages. The tool then sets is_page to 0 for records matching the URLs in the exclude-set.
Updates to the exclude-set might not take effect immediately, because the tool maintains a cache that is refreshed periodically. The effect should, however, be visible within approximately 60 seconds.
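The delayed effect of exclude-set updates can be illustrated with a minimal sketch of a periodically refreshed cache. This is not the tool's implementation: `ExcludeCache`, `apply_exclude`, the `load_urls` callback, the `url` field name, and the 60-second default interval are assumptions made for illustration only.

```python
import time
from typing import Callable, Iterable, Set


class ExcludeCache:
    """Holds a snapshot of excluded URLs and reloads it periodically (hypothetical)."""

    def __init__(self, load_urls: Callable[[], Iterable[str]],
                 refresh_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load_urls = load_urls
        self._refresh_seconds = refresh_seconds
        self._clock = clock
        self._urls: Set[str] = set(load_urls())
        self._loaded_at = clock()

    def is_excluded(self, url: str) -> bool:
        # Reload the snapshot only when it is older than the refresh interval,
        # so a change to the exclude-set is seen after at most that interval.
        if self._clock() - self._loaded_at >= self._refresh_seconds:
            self._urls = set(self._load_urls())
            self._loaded_at = self._clock()
        return url in self._urls


def apply_exclude(record: dict, cache: ExcludeCache, url_field: str = "url") -> dict:
    """Return a copy of the record with is_page forced to 0 if its URL is excluded."""
    out = dict(record)
    if cache.is_excluded(out[url_field]):
        out["is_page"] = 0
    return out
```

In this sketch a URL added to the exclude-set is ignored until the next refresh, which mirrors the delay described above. Matching is an exact comparison on the full URL.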
The fields appended to the output are these:
| Field Name | Value |
| is_page | 1 if the hit is determined to be a page impression. 0 otherwise. |
| parent_page | The URL of the page from which this hit is derived. |
| sequence | A sequence number used to keep track of a unique client's behavior. Incremented for each page impression generated by a unique client. |
| confidence | The likelihood that the hit is a page impression, in the range 0-1. The closer the value is to 1, the more likely the hit is a page impression. |
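The sequence field described in the table can be sketched as a per-client counter. This is only an illustration under stated assumptions: the function name `add_sequence` and the `client_id` field are hypothetical, the input records are assumed to carry is_page already, and the assumption that hits which are not page impressions carry the client's current sequence number is not confirmed by the tool's documentation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator


def add_sequence(records: Iterable[dict],
                 client_field: str = "client_id") -> Iterator[dict]:
    """Append a per-client sequence number to each record (hypothetical sketch)."""
    counters: Dict[str, int] = defaultdict(int)
    for record in records:
        out = dict(record)
        client = out[client_field]
        # The counter advances only on page impressions.
        if out.get("is_page") == 1:
            counters[client] += 1
        # Assumption: non-page hits reuse the client's current counter value.
        out["sequence"] = counters[client]
        yield out


hits = [
    {"client_id": "a", "is_page": 1},
    {"client_id": "a", "is_page": 0},
    {"client_id": "b", "is_page": 1},
    {"client_id": "a", "is_page": 1},
]
sequences = [r["sequence"] for r in add_sequence(hits)]
```

Each client has an independent counter, so hits from different clients can interleave without affecting one another's numbering.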
The field used to uniquely identify a client.
Maximum allowed cache size (1-15 Mb).
With this option, each URL is looked up in a record set to determine whether it is to be excluded as a possible page impression. Typically the record set is a table read by the Lookup/ODBC tool.
The field of the exclude-set to match URLs against. The values of this field must be full URLs.