Saturday, 29 November 2014

Extracting trace information using gawk


I am new to awk programming and am using gawk to extract trace information from an MRT file for further analysis. I have managed to extract trace information from the pcap file format, but cannot figure out how to do it for the MRT format. Let me first explain what I am trying to extract, using the pcap format as an example.


My pcap input file is:



No. Time Source Destination Protocol Length User Datagram Protocol Info
1 0.000000 2001:4958:10:2::2 2001:4958:10:2::3 BGP 143 UPDATE Message
Frame 1: 143 bytes on wire (1144 bits), 143 bytes captured (1144 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 1, Ack: 1, Len: 37
Border Gateway Protocol
No. Time Source Destination Protocol Length User Datagram Protocol Info
2 0.326625 2001:4958:10:2::2 2001:4958:10:2::3 BGP 184 UPDATE Message
Frame 2: 184 bytes on wire (1472 bits), 184 bytes captured (1472 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 38, Ack: 1, Len: 78
Border Gateway Protocol
No. Time Source Destination Protocol Length User Datagram Protocol Info
3 1.178114 2001:4958:10:2::2 2001:4958:10:2::3 TCP 106 bgp > 56797 [ACK] Seq=116 Ack=20 Win=16384 Len=0 TSval=3269200636 TSecr=371929488
Frame 3: 106 bytes on wire (848 bits), 106 bytes captured (848 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 6, Src: 2001:4958:10:2::2 (2001:4958:10:2::2), Dst: 2001:4958:10:2::3 (2001:4958:10:2::3)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 56797 (56797), Seq: 116, Ack: 20, Len: 0
No. Time Source Destination Protocol Length User Datagram Protocol Info
4 2.410144 64.251.87.209 64.251.87.210 BGP 228 UPDATE Message, UPDATE Message
Frame 4: 228 bytes on wire (1824 bits), 228 bytes captured (1824 bits)
Ethernet II, Src: Cisco_e7:a1:c0 (00:1b:0d:e7:a1:c0), Dst: JuniperN_3e:ba:bd (78:19:f7:3e:ba:bd)
Internet Protocol Version 4, Src: 64.251.87.209 (64.251.87.209), Dst: 64.251.87.210 (64.251.87.210)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 65502 (65502), Seq: 1, Ack: 1, Len: 154
Border Gateway Protocol
Border Gateway Protocol
No. Time Source Destination Protocol Length User Datagram Protocol Info
5 3.467853 206.47.102.206 206.47.102.201 BGP 105 KEEPALIVE Message
Frame 5: 105 bytes on wire (840 bits), 105 bytes captured (840 bits)
Ethernet II, Src: JuniperN_36:98:52 (5c:5e:ab:36:98:52), Dst: JuniperN_3e:bf:49 (78:19:f7:3e:bf:49)
Internet Protocol Version 4, Src: 206.47.102.206 (206.47.102.206), Dst: 206.47.102.201 (206.47.102.201)
Transmission Control Protocol, Src Port: bgp (179), Dst Port: 55700 (55700), Seq: 1, Ack: 1, Len: 19
Border Gateway Protocol


I wanted to extract the following fields from the file:



  • Time

  • Source

  • Destination

  • Protocol

  • Length User Datagram


I created a simple script:



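# Print time, source, destination, protocol and length for the packet
# summary lines, i.e. the lines whose first field is the packet number.
$1 ~ /^[0-9]+$/ {
    print $2, $3, $4, $5, $6
}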


I ran it on the input trace (gawk -f script.txt input-pcap.txt >> pcap-out.txt) and got the following output:



0.000000 2001:4958:10:2::2 2001:4958:10:2::3 BGP 143
0.326625 2001:4958:10:2::2 2001:4958:10:2::3 BGP 184
1.178114 2001:4958:10:2::2 2001:4958:10:2::3 TCP 106
2.410144 64.251.87.209 64.251.87.210 BGP 228
3.467853 206.47.102.206 206.47.102.201 BGP 105


Now I want to do the same for the MRT input format. The input file looks like this:



TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 0
PREFIX: 0.0.0.0/0
FROM:96.4.0.55 AS11686
ORIGINATED: 10/24/07 06:26:23
ORIGIN: IGP
ASPATH: 11686 3561
NEXT_HOP: 96.4.0.55
STATUS: 0x1
TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 1
PREFIX: 0.0.0.0/0
FROM:213.140.32.148 AS12956
ORIGINATED: 10/24/07 06:26:16
ORIGIN: IGP
ASPATH: 12956
NEXT_HOP: 213.140.32.148
STATUS: 0x1
TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 2
PREFIX: 3.0.0.0/8
FROM:207.45.223.244 AS6453
ORIGINATED: 10/31/07 07:37:39
ORIGIN: IGP
ASPATH: 6453 701 703 80
NEXT_HOP: 207.45.223.244
STATUS: 0x1
TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 3
PREFIX: 3.0.0.0/8
FROM:195.219.96.239 AS6453
ORIGINATED: 10/31/07 07:49:07
ORIGIN: IGP
ASPATH: 6453 701 703 80
NEXT_HOP: 195.219.96.239
STATUS: 0x1
TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 4
PREFIX: 3.0.0.0/8
FROM:129.250.0.11 AS2914
ORIGINATED: 10/31/07 06:09:07
ORIGIN: IGP
ASPATH: 2914 701 703 80
NEXT_HOP: 129.250.0.11
MULTI_EXIT_DISC: 6
COMMUNITY: 2914:420 2914:2000 2914:3000 65504:70
STATUS: 0x1


I want to extract the following information in the same style as for the pcap format, so that the analysis software can read it:



  • Time (difference from the time field of the first packet, in seconds)

  • Source (6th field, FROM; only the IP address is needed)

  • Destination (10th field, NEXT_HOP)

  • Protocol/Origin (8th field, ORIGIN)


I tried a couple of scripts but haven't been successful. Please advise. Thanks!
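
One possible approach, as a minimal sketch rather than a definitive solution. It assumes the MRT text has one "KEY: value" pair per line (as tools such as bgpdump typically print it), that every record starts with a TIME: line, that dates are MM/DD/YY, and that two-digit years below 70 mean 20YY. The function names extract_mrt, days and emit are made up for this example. The date arithmetic is done by hand so that the script also runs under awks without gawk's time functions, and the time difference is printed with six decimals only to mirror the pcap output above.

```shell
#!/bin/sh
# Sketch: print "time-offset source destination origin" for each MRT record.
# Assumes one "KEY: value" pair per line and a TIME: line starting each record.

extract_mrt() {
    awk '
    # Day number of a Gregorian date, counted from 1970-01-01.
    function days(y, m, d) {
        if (m <= 2) { y--; m += 12 }   # count Jan and Feb as months 13 and 14 of the previous year
        return 365 * y + int(y / 4) - int(y / 100) + int(y / 400) + int((153 * (m - 3) + 2) / 5) + d - 719469
    }

    # Print one output line for the record collected so far.
    function emit() {
        printf "%.6f %s %s %s\n", t - t0, src, dst, org
    }

    # A TIME: line closes the previous record and opens a new one.
    $1 == "TIME:" {
        if (have) emit()
        split($2, dt, "/")             # dt[1]=month, dt[2]=day, dt[3]=two-digit year (assumed order)
        split($3, tm, ":")             # tm[1]=hours, tm[2]=minutes, tm[3]=seconds
        y = dt[3] + 0
        y += (y < 70) ? 2000 : 1900    # assumed century window
        t = days(y, dt[1] + 0, dt[2] + 0) * 86400 + tm[1] * 3600 + tm[2] * 60 + tm[3]
        if (!have) t0 = t              # remember the first record time
        have = 1; src = dst = org = ""
        next
    }

    # Strip the key (with or without a following space) and keep the first word.
    /^FROM:/     { sub(/^FROM: */, "");     src = $1; next }   # IP only, drops the AS number
    /^ORIGIN:/   { sub(/^ORIGIN: */, "");   org = $1; next }   # does not match ORIGINATED:
    /^NEXT_HOP:/ { sub(/^NEXT_HOP: */, ""); dst = $1; next }

    END { if (have) emit() }
    ' "$@"
}

# Demo on the first record of the sample input.
extract_mrt <<'EOF'
TIME: 11/01/07 00:11:09
TYPE: TABLE_DUMP/INET
VIEW: 0
SEQUENCE: 0
PREFIX: 0.0.0.0/0
FROM:96.4.0.55 AS11686
ORIGINATED: 10/24/07 06:26:23
ORIGIN: IGP
ASPATH: 11686 3561
NEXT_HOP: 96.4.0.55
STATUS: 0x1
EOF
```

With the function defined, a real file could be processed in the same way as the pcap trace, for example extract_mrt input-mrt.txt >> mrt-out.txt (input-mrt.txt and mrt-out.txt are placeholder names). Since all records in the sample share the same TIME value, every offset for that sample would be 0.000000.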


