Amazon Cloudwatch Logs Insights with JSON fields
JsonAmazon CloudwatchAmazon CloudwatchlogsJson Problem Overview
I am trying to use Logs Insights with data containing JSON in one of the fields, and to parse the JSON fields
My data looks like the following when I put it in insights with the starter code
fields @timestamp, @message
| sort @timestamp desc
| limit 25
How can I easily extract the path
variable in my nested JSON to perform aggregations on it ? By looking at some documentation, I thought @message.path
would work but it does not seem so. Has anyone successfully interpreted JSON logs in Insights
EDIT : Sample of what my data looks like
#
@timestamp
@message
1
2018-12-19 23:42:52.000
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"user,tags,promotions,company_sector,similar_professionals.tags,similar_professionals.user","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
@logStream i-05d1d61ab853517a0
@message I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"xxx","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
@timestamp 1545262972000
2
2018-12-19 23:42:16.000
I, [2018-12-19T23:42:16.723472 #851] INFO -- : [ea712503-eb86-4a6e-ab38-ddbcd6c2b4d0] {"method":"GET","path":"/api/v1/heartbeats/new","format":"json","controller":"API::V1::Public::HeartbeatsController","action":"new","status":201,"duration":9.97,"view":3.2,"time":"2018-12-19T23:42:16.712+00:00","params":{"format":"json","compress":false},"@timestamp":"2018-12-19T23:42:16.722Z","@version":"1","message":"[201] GET /api/v1/heartbeats/new (API::V1::Public::HeartbeatsController#new)"}
Json Solutions
Solution 1 - Json
CloudWatch Insights Logs automatically discovers fields for the following log types:
> ### Lambda logs > > CloudWatch Logs Insights automatically discovers log fields in Lambda logs, but only for the first embedded JSON fragment in each log event (Note: emphasis mine). If a Lambda log event contains multiple JSON fragments, you can parse and extract the log fields by using the parse command. For more information, see Fields in JSON Logs. > > ### CloudTrail logs > > See fields in JSON logs.
Source: Supported Logs and Discovered Fields
If @message
is I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}
Then you can select and filter the fields like so:
fields @timestamp, @message, method
| filter method = "GET"
| sort @timestamp desc
It works with nested fields too, i.e. params.format = "json"
or results.0.firstName = "Paul"
.
Solution 2 - Json
You can use the parse
command to extract fields.
If @message
is
I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}
Then you extract the fields like so:
fields @timestamp, @message
| parse "I, [*T*] INFO -- : {"method":"*"}" as @date, @time, @method
| filter method=GET
| sort @timestamp desc
| limit 20
The documentation is rather light for now. I am able to get results by replacing the wildcard *
with a regular expression, but then the parsing fails.
Solution 3 - Json
Building on @pyb insights, I was able to use parse @message '"path":"*"' as path
to extract the path from any place in the @message
.
You can go on to get your method by piping another parse @message '"method":"*"' as method
without concern for ordering as it is a second global plain text search on @message
In the case where your @message
is:
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"xxx","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
Using:
parse @message '"path":"*"' as path | parse @message '"method":"*"' as method
will result in the fields: path = '/api/v1/professionals/ID'
and method = 'GET'
Note that this is still simply string parsing and as such, it has no concept of nested keys like params.format
would not find json
, however using just format
would, so long as there wasn't another format
string anywhere in your @message
.
Also note that this is for the case where Insights is not discovering your JSON in the message. I belive this is the case that @pyb was referring to in this answer. My logs aren't being discovered either using the following format
info - Request: {"method":"POST","path":"/auth/login/","body":{"login":{"email":"[email protected]","password":"********"}},"uuid":"36d76df2-aec4-4549-8b73-f237e8f14e23","ip":"*.*.*.*"}
Solution 4 - Json
Also another parse with help of regex
Assume your @message
is:
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/"}
You can extract method by
fields @timestamp, @message
| parse @message /\"method\":\"(?<method_type>.*?)\"/
Solution 5 - Json
Would this regex query within the parse command help you?
filter @message like / \"path\":\"/
| parse @message /(?<@endpt>((\/[a-zA-Z0-9_{}()-?]+){1,}))/
Good luck!