Search Command> stats, eventstats and streamstats | Splunk

Getting started with stats, eventstats and streamstats

When I first joined Splunk, like many newbies I needed direction on where to start. Someone gave me some excellent advice:

“Learn the stats and eval commands.”

Putting eval aside for another blog post, let’s examine the stats command. It never ceases to amaze me how many Splunkers are stuck in the “super grep” stage. They just use Splunk to search (happily I might add) for keywords and phrases over many sources of machine data. Hopefully this will help advance some folks beyond “super grep” as well as assist those who may be new to Splunk.

When you dive into Splunk’s excellent documentation, you will find that the stats command has a couple of siblings — eventstats and streamstats. In this blog post, I will attempt, by means of a simple web log example, to illustrate how the variations on the stats command work, and how they are different. Stats typically gets a lot of use, but I’ll use it to set the stage for eventstats and streamstats which don’t get as much use. Reference documentation links are included at the end of the post. I will take a very basic, step-by-step approach by going through what is happening with the stats command, and then expand on that example to show how stats differs from eventstats and streamstats. In an effort to keep it simple, I’ll limit the data of interest to five (5) events with the head command. (If you’re cool with stats, scroll on down to eventstats or streamstats.)

As the name implies, stats is for statistics. Per the Splunk documentation:

Description:
Calculate aggregate statistics over the dataset, similar to SQL aggregation. If called without a by clause, one row is produced, which represents the aggregation over the entire incoming result set. If called with a by-clause, one row is produced for each distinct value of the by-clause.

There are also a number of statistical functions at your disposal: avg(), count(), distinct_count(), median(), perc<int>(), stdev(), sum(), sumsq(), and more.
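To give a quick taste of how these combine, several functions can be used in a single stats call against the same web logs. This is just an illustrative sketch, and the output field names (AvgBytes, MaxBytes, UniqueClients) are mine, not from the original post:

sourcetype=access_combined* | stats count, avg(bytes) as AvgBytes, max(bytes) as MaxBytes, dc(clientip) as UniqueClients

Here dc() is simply shorthand for distinct_count().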

So let’s look at a simple search command that sums up the number of bytes per IP address from some web logs.

To begin, do a simple search of the web logs in Splunk and look at five events and the associated byte counts for the two IP addresses in the clientip field.

sourcetype=access_combined* | head 5

The fields (and the values of those fields) of interest here are clientip and bytes:

STATS

Splunk users will notice the raw log events in the results area, as well as a number of fields (in addition to bytes and clientip) listed in a column to the left in the screenshot above. Right now we are just interested in the number of bytes per clientip. Using the stats command and the sum function, I can compute the sum of the bytes for each clientip. I'll also rename the result to "ASimpleSumOfBytes" so that it stands out. In addition, to make it easy to find alphabetically, I've prefixed it with an "A".

sourcetype=access_combined* | head 5 | stats sum(bytes) as ASimpleSumOfBytes by clientip

To understand what happened with the above search, take a look at the "search pipeline" section of the Search Manual in the Splunk documentation, paying attention to the intermediate tables as well as the different types of search commands.

Splunk computes the statistics, in this case the sum, and puts them in a table along with the relevant client IP addresses. This is wonderful and easy, but what if one wishes to build on this and show the original byte count (or any other related field) alongside the aggregation in a table such as this:

sourcetype=access_combined* | head 5 | stats sum(bytes) as ASimpleSumOfBytes by clientip | table bytes, ASimpleSumOfBytes, clientip

Hmmm. What happened to my bytes field? Also explained in the documentation is the anatomy of a search. With each Splunk command or term, an intermediate table is produced without the user having to issue any command to allocate the tables. If we wish to keep any of the original fields (like bytes), or perform additional calculations on them, that work has to happen before the stats command. See what happens in the screenshot above when we try to add the bytes field to the end of the search string. This makes the case for eventstats.
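(As an aside, and not something the original search attempted: stats itself can carry original values forward if you ask it to, for example with the multivalue values() function. A sketch:

sourcetype=access_combined* | head 5 | stats values(bytes) as bytes, sum(bytes) as ASimpleSumOfBytes by clientip | table bytes, ASimpleSumOfBytes, clientip

Here bytes comes back as a multivalue field per clientip rather than one value per event, which is exactly the gap eventstats fills more naturally.)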

EVENTSTATS

Notice that the bytes column is empty above, because once the table is created by the stats command, Splunk no longer knows anything about the original bytes field earlier in the pipeline. This is where eventstats can be helpful. The eventstats command computes the requested statistics just like stats, but aggregates them onto the original raw events as shown below:

sourcetype=access_combined* | head 5 | eventstats sum(bytes) as ASimpleSumOfBytes by clientip

Now, just like with stats, there are two values for ASimpleSumOfBytes (one for each clientip), but they are aggregated onto the raw events and can be used for later calculations. Just a note: your raw data is untouched; the aggregated field exists only in the search results produced by eventstats.

If I want to show the bytes field for each event along with the summation and the clientip, I can easily create the table that failed with stats. Note that the sum of all bytes per clientip is included alongside each of the original bytes values, as the following search illustrates:

sourcetype=access_combined* | head 5 | sort _time | eventstats sum(bytes) as ASimpleSumOfBytes by clientip | table bytes, ASimpleSumOfBytes, clientip
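Because the total now rides along on every event, it can feed later calculations. As a hedged example (PercentOfTotal is just an illustrative field name, not from the original post), each event's share of its clientip total could be computed with eval:

sourcetype=access_combined* | head 5 | sort _time | eventstats sum(bytes) as ASimpleSumOfBytes by clientip | eval PercentOfTotal=round(bytes/ASimpleSumOfBytes*100,2) | table bytes, ASimpleSumOfBytes, PercentOfTotal, clientip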

STREAMSTATS

Having the statistics aggregated onto the original events is great, but what if one is interested in what is happening in a streaming manner, that is, as Splunk sees the events in time? Streamstats is your command. To help visualize this, I'm sorting the events in ascending time order using the "_time" internal field, and then trying out streamstats:

sourcetype=access_combined* | head 5 | sort _time | streamstats sum(bytes) as ASimpleSumOfBytes by clientip

Like eventstats, streamstats aggregates the statistics to the original data, so all of the original data is accessible for further calculations, should we wish. By including time and the original byte count in the table below, we can better see what is going on with the streamstats command.

sourcetype=access_combined* | head 5 | sort _time | streamstats sum(bytes) as ASimpleSumOfBytes by clientip | table _time, clientip, bytes, ASimpleSumOfBytes

As shown in the screenshot above, instead of a single total for each clientip (as with stats and eventstats), there is a running sum for each event as it is seen in time, each one building on the last. Also note that the final running sum for each clientip matches the total calculated by stats and eventstats.

The difference here is that the value of the calculated field "ASimpleSumOfBytes" varies depending on when Splunk sees the event in the stream. Where is this helpful? As it turns out, many of the questions that folks have about their data concern what is going on at a specific moment or range in time. Streamstats is extremely useful for this kind of searching and reporting.
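For instance, streamstats also accepts a window option, so a sketch of a three-event moving average of bytes per clientip might look like this (MovingAvgBytes is just an illustrative field name):

sourcetype=access_combined* | sort _time | streamstats window=3 avg(bytes) as MovingAvgBytes by clientip | table _time, clientip, bytes, MovingAvgBytes

Each event then carries the average of the three most recent bytes values seen so far for its clientip.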

Below I have included some links to the Splunk Documentation and Answers communities. Check them out. I hope this has been helpful.

Happy Splunking

References:

About the Splunk search language
http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutthesearchlanguage

Anatomy of a Splunk search
http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutthesearchpipeline#The_anatomy_of_a_search

Answers post to help understand visualizing time in Splunk, related to streamstats
http://answers.splunk.com/answers/105733/streamstats-is-reversed

Splunker finds a cool use for streamstats
http://blogs.splunk.com/2013/10/31/streamstats-example

The stats page in the Splunk docs
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Stats

Functions that work with Stats
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonStatsFunctions


FAQs

What is the difference between stats, eventstats and streamstats?

eventstats adds the desired stats function result to each event, derived from the entire set of events. streamstats adds the desired stats function result to each event, derived from the events seen up to the current event in the stream; a moving average is a classic example.

What is the streamstats command in Splunk?

The SPL2 streamstats command adds a cumulative statistical value to each search result as each result is processed. For example, you can calculate the running total for a particular field, or compare a value in a search result with the cumulative value, such as a running average.

What is the eventstats command in Splunk?

I like to think of eventstats as a way to calculate "grand totals" within a result set; those totals can then be used to introspect the data set further.
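As a hedged illustration of that idea (AvgBytes is just an illustrative field name), a grand total or average added by eventstats can be used to filter the original events, for example keeping only events whose bytes value is above the overall average:

sourcetype=access_combined* | eventstats avg(bytes) as AvgBytes | where bytes > AvgBytes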


What is the streamstats command?

The streamstats command is another statistical command in Splunk; it performs statistical analysis on events as they stream through the search, accumulating values in event order.

What is the use of the stats command in Splunk?

The SPL2 stats command calculates aggregate statistics, such as average, count, and sum, over the incoming search results set. This is similar to SQL aggregation.

What are streaming commands in Splunk?

A streaming command applies a transformation to each event returned by a search. For example, the rex command is streaming because it extracts and adds fields to events at search time.
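As a small sketch of a streaming command in action (the regular expression and the http_version field name are assumptions for illustration, not from this article), rex can pull a field out of the raw web log text as each event streams by, and the result can then feed stats:

sourcetype=access_combined* | rex field=_raw "HTTP/(?<http_version>[\d.]+)" | stats count by http_version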

What is the limit of eventstats?

By default, eventstats can aggregate up to 50,000 events at a time. You can change this limit with the MaxNoOfAggregatedEvents parameter.

What is the 50,000 limit in Splunk stats?

The limit you're talking about is the one where, if your base search just returns raw event rows, Splunk only keeps 50,000 events in the search results. This means that when you later run a post-process search, the results can be misleading.

What is the difference between the stats and transaction commands in Splunk?

stats provides aggregation (counts, sums, averages) across events, while transaction groups related events into a single multi-event transaction, for example ten steps performed as part of one transaction, and adds fields such as duration and eventcount.
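A hedged sketch of the contrast (the maxpause value is just illustrative): stats count by clientip would return one aggregate row per clientip, while transaction groups related events and exposes per-group fields:

sourcetype=access_combined* | transaction clientip maxpause=5m | table clientip, duration, eventcount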

