Skip to content

lkl: Add FreeBSD platform support#1

Closed
cemeyer wants to merge 22 commits intolkl:masterfrom
cemeyer:cem/xbuild_fbsd
Closed

lkl: Add FreeBSD platform support#1
cemeyer wants to merge 22 commits intolkl:masterfrom
cemeyer:cem/xbuild_fbsd

Conversation

@cemeyer
Copy link

@cemeyer cemeyer commented Nov 9, 2015

Please don't pull this as-is! In particular, the first few commits are big hacks that worked around issues only present on FreeBSD.

For example, without "Hack NR_CPUS" I got some warning about an invalid range; "timeconst.bc" hacks around limitations in FreeBSD's bc(1) program by hardcoding 250 hz in the bc script; and "Build 64-bit" hardcodes CONFIG_64BIT (without which it seemed to try and build 32-bit on a 64-bit host).

Despite those kludges, there are some decent portability-improving patches in here that won't hurt Linux or NT support. And some fixes for compiler or other issues (e.g. statfs64 was missing a 3rd argument which resulted in -EINVAL on all calls).

With this patchset I have ext4 mounting r/w on FreeBSD w/ lklfuse. Neat!

@tavip
Copy link
Member

tavip commented Nov 10, 2015

Awesome ! I will test and cherry pick most of the patches and rework some - "Build 64bit" can use the OUTPUT_FORMAT to detect that host is 64bit and enable 64bit build. Thanks !

@cemeyer
Copy link
Author

cemeyer commented Nov 10, 2015

Thanks!

@tavip
Copy link
Member

tavip commented Nov 11, 2015

Hi Conrad,

I have created a new branch (lkl_fbsd) which has your fixes. I've added your Signed-off-by: to your commits to adhere with the Linux process, hope that is ok.

I've also squashed a few commits and reworked some. Could you take a look and give it a try? Hope I didn't broke anything :)

Thanks,
Tavi

@cemeyer
Copy link
Author

cemeyer commented Nov 11, 2015

I've added your Signed-off-by: to your commits to adhere with the Linux process, hope that is ok.

Fine with me. (Edit: if it's not too difficult, can the Signed-off-by address be changed to cem@FreeBSD.org?)

I've also squashed a few commits and reworked some. Could you take a look and give it a try? Hope I didn't broke anything :)

I'll take a look, thanks.

Edit:

Re: "lkl: convert makefile echo \t to inline tab", it worked on FreeBSD :). If the inline tab works everywhere that's even better.

Re: "error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]" (both of them) — I believe you can just cast the result to void.

(void)write(STDOUT_FILENO, str, len);

Instead of the unused stack variable.

Re: "lkl: add support for 64bit FreeBSD", excellent! It's way better than my hack, thanks!

The patches look good to me. I will try them out shortly (I suspect the bc script will still fail on FreeBSD, unfortunately). If you haven't already, I have some follow-up patches for lklfuse you could look at in https://github.com/cemeyer/linux/tree/cem/lklfuse . I don't have XFS or BTRFS actually mounting yet; currently the kernel seems to get wedged at

[    0.004000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.004000] xor: measuring software checksum speed

@cemeyer
Copy link
Author

cemeyer commented Nov 11, 2015

A test build on lkl_fbsd branch:

...
  CHK     include/generated/timeconst.h
dc: not a string
  UPD     include/generated/timeconst.h
...
In file included from include/linux/jiffies.h:10:0,
                 from include/linux/ktime.h:25,
                 from include/linux/rcupdate.h:47,
                 from include/linux/srcu.h:33,
                 from include/linux/notifier.h:15,
                 from include/linux/memory_hotplug.h:6,
                 from include/linux/mmzone.h:812,
                 from include/linux/gfp.h:5,
                 from include/linux/kmod.h:22,
                 from include/linux/module.h:13,
                 from init/main.c:15:
include/generated/timeconst.h:11:2: error: #error "include/generated/timeconst.h has the wrong HZ value!"
 #error "include/generated/timeconst.h has the wrong HZ value!"
  ^
include/generated/timeconst.h:14:2: error: #error Totally bogus HZ value!
 #error Totally bogus HZ value!
  ^
include/generated/timeconst.h:15:1: error: expected identifier or '(' before numeric constant
 0
 ^
...

Yup, the bc script is still a problem. I think it may be easiest to port GNU bc to FreeBSD — the FreeBSD native bc is too limited for the needs of this script.

@cemeyer
Copy link
Author

cemeyer commented Nov 11, 2015

With just the BC hack from before and a second patch (-largp for fs2tar got dropped from cemeyer@8c2495a6), lkl_fbsd builds again: https://github.com/cemeyer/linux/tree/lkl_fbsd_plus Edit: And I can mount a USB ext4 under FUSE mostly fine. O_CREAT is broken without either handling it in open(2)or adding the FUSE create() method (I added it in my cem/lklfuse branch: cemeyer@ef78131).

@tavip
Copy link
Member

tavip commented Nov 11, 2015

Fine with me. (Edit: if it's not too difficult, can the Signed-off-by address be changed to >cem@FreeBSD.org?)

Yep, I'll do that. I guess I should also change the Author address, right?

Re: "error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused->result]" (both of them) — I believe you can just cast the result to void.

With the new compilers (at least on Ubuntu 14.04) the (void) cast trick does not work anymore apparently due to better optimizations :)

-largp for fs2tar got dropped from cemeyer/linux@8c2495a),

Oops, I'll fix that.

I suspect the bc script will still fail on FreeBSD, unfortunately

Not sure how to deal with this. Can we keep it separate for now until maybe we find a better solution?

I'll rebase the patches to address the above and force push to the same branch and way you say ok I will merge it to the main branch.

@cemeyer
Copy link
Author

cemeyer commented Nov 11, 2015

Yep, I'll do that. I guess I should also change the Author address, right?

Sure.

With the new compilers (at least on Ubuntu 14.04) the (void) cast trick does not work anymore apparently due to better optimizations :)

I think it is just GCC being very pedantic about warn_unused_result.

I suspect the bc script will still fail on FreeBSD, unfortunately

Not sure how to deal with this. Can we keep it separate for now until maybe we find a better solution?

Yes. I think I'll try and import GNU bc into FreeBSD ports. After that, we may want a patch that changes bc invocations to $(BC) and sets BC to bc on Linux and gbc on FreeBSD. I think that's a straightforward way of getting where we want.

I'll rebase the patches to address the above and force push to the same branch and way you say ok I will merge it to the main branch.

Great, thanks!

phhusson referenced this pull request in phhusson/linux Nov 13, 2016
The data_mutex is initialized too late, as it is needed for
each device driver's power control, causing an OOPS:

    dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 PGD 0
    Oops: 0002 [#1] SMP
    Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
    CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod lkl#24
    Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
    task: ffff88020e943840 task.stack: ffff8801f36ec000
    RIP: 0010:[<ffffffff846617af>]  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100
    RSP: 0018:ffff8801f36efb10  EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88021509bdc8 RCX: 00000000c0000100
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88021509bdcc
    RBP: ffff8801f36efb58 R08: ffff88021f216320 R09: 0000000000100000
    R10: ffff88021f216320 R11: 00000023fee6c5a1 R12: ffff88020e943840
    R13: ffff88021509bdcc R14: 00000000ffffffff R15: ffff88021509bdd0
    FS:  00007f21adb86740(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000215bce000 CR4: 00000000001406f0
    Call Trace:
       mutex_lock+0x16/0x25
       cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
       dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
       cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
       usb_probe_interface+0xf3/0x2a0
       driver_probe_device+0x208/0x2b0
       __driver_attach+0x87/0x90
       driver_probe_device+0x2b0/0x2b0
       bus_for_each_dev+0x52/0x80
       bus_add_driver+0x1a3/0x220
       driver_register+0x56/0xd0
       usb_register_driver+0x77/0x130
       do_one_initcall+0x46/0x180
       free_vmap_area_noflush+0x38/0x70
       kmem_cache_alloc+0x84/0xc0
       do_init_module+0x50/0x1be
       load_module+0x1d8b/0x2100
       find_symbol_in_section+0xa0/0xa0
       SyS_finit_module+0x89/0x90
       entry_SYSCALL_64_fastpath+0x13/0x94
    Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43 RIP  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP <ffff8801f36efb10>
    CR2: 0000000000000000

So, move it to the struct dvb_usb_device and initialize it
before calling the driver's callbacks.

Reported-by: Jörg Otte <jrg.otte@gmail.com>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
phhusson referenced this pull request in phhusson/linux Nov 13, 2016
The DVB binding schema at the DVB core assumes that the frontend is a
separate driver.  Faling to do that causes OOPS when the module is
removed, as it tries to do a symbol_put_addr on an internal symbol,
causing craches like:

    WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
    Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
    CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
    Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
    Call Trace:
       dump_stack+0x44/0x64
       __warn+0xfa/0x120
       module_put+0x57/0x70
       module_put+0x57/0x70
       warn_slowpath_null+0x23/0x30
       module_put+0x57/0x70
       gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
       symbol_put_addr+0x27/0x50
       dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]

From Derek's tests:
    "Attach bug is fixed, tuning works, module unloads without
     crashing. Everything seems ok!"

Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
Userspace can begin and suspend a transaction within the signal
handler which means they might enter sys_rt_sigreturn() with the
processor in suspended state.

sys_rt_sigreturn() wants to restore process context (which may have
been in a transaction before signal delivery). To do this it must
restore TM SPRS. To achieve this, any transaction initiated within the
signal frame must be discarded in order to be able to restore TM SPRs
as TM SPRs can only be manipulated non-transactionally..
>From the PowerPC ISA:
  TM Bad Thing Exception [Category: Transactional Memory]
   An attempt is made to execute a mtspr targeting a TM register in
   other than Non-transactional state.

Not doing so results in a TM Bad Thing:
[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [lkl#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
 nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
 ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm
 uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure
 scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 lkl#34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]> CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

It isn't clear exactly if there is really a use case for userspace
returning with a suspended transaction, however, doing so doesn't (on
its own) constitute a bad frame. As such, this patch simply discards
the transactional state of the context calling the sigreturn and
continues.

Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Acked-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
I got this:

    divide error: 0000 [lkl#1] PREEMPT SMP KASAN
    CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ lkl#189
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff8801120a9580 task.stack: ffff8801120b0000
    RIP: 0010:[<ffffffff82c8bd9a>]  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
    RSP: 0018:ffff88011aa87da8  EFLAGS: 00010006
    RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
    RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
    R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
    R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
    FS:  00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
    Stack:
     0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
     ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
     00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
    Call Trace:
     <IRQ>
     [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
     [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
     [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
     [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
     [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
     [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
     [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
     [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
     [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
     <EOI>
     [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
     [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
     [<ffffffff82c87015>] snd_timer_continue+0x45/0x80
     [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
     [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
    RIP  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
     RSP <ffff88011aa87da8>
    ---[ end trace 6aa380f756a21074 ]---

The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
completely new/unused timer -- it will have ->sticks == 0, which causes a
divide by 0 in snd_hrtimer_callback().

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
I hit this with syzkaller:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [lkl#1] PREEMPT SMP KASAN
    CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ lkl#190
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff88011278d600 task.stack: ffff8801120c0000
    RIP: 0010:[<ffffffff82c8ba07>]  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
    RSP: 0018:ffff8801120c7a60  EFLAGS: 00010006
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007
    RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048
    RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790
    R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980
    R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286
    FS:  00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0
    Stack:
     ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0
     ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000
     ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0
    Call Trace:
     [<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670
     [<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0
     [<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830
     [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
     [<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0
     [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
     [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
     [<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0
     [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
     [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
     [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
     [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
     [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
     [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
     [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
     [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
     [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
     [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46
    RIP  [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100
     RSP <ffff8801120c7a60>
    ---[ end trace 5955b08db7f2b029 ]---

This can happen if snd_hrtimer_open() fails to allocate memory and
returns an error, which is currently not checked by snd_timer_open():

    ioctl(SNDRV_TIMER_IOCTL_SELECT)
     - snd_timer_user_tselect()
	- snd_timer_close()
	   - snd_hrtimer_close()
	      - (struct snd_timer *) t->private_data = NULL
        - snd_timer_open()
           - snd_hrtimer_open()
              - kzalloc() fails; t->private_data is still NULL

    ioctl(SNDRV_TIMER_IOCTL_START)
     - snd_timer_user_start()
	- snd_timer_start()
	   - snd_timer_start1()
	      - snd_hrtimer_start()
		- t->private_data == NULL // boom

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
If we hit the error path, we have never called drm_encoder_init() and so
have nothing to cleanup. Doing so hits a null dereference:

[   10.066261] BUG: unable to handle kernel NULL pointer dereference at 00000104
[   10.066273] IP: [<c16054b4>] mutex_lock+0xa/0x15
[   10.066287] *pde = 00000000
[   10.066295] Oops: 0002 [lkl#1]
[   10.066302] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm iTCO_wdt iTCO_vendor_support ppdev evdev snd_intel8x0 snd_ac97_codec ac97_bus psmouse snd_pcm snd_timer snd pcspkr uhci_hcd ehci_pci soundcore sr_mod ehci_hcd serio_raw i2c_i801 usbcore i2c_smbus cdrom lpc_ich mfd_core rng_core e100 mii floppy parport_pc parport acpi_cpufreq button processor usb_common eeprom lm85 hwmon_vid autofs4
[   10.066378] CPU: 0 PID: 132 Comm: systemd-udevd Not tainted 4.8.0-rc3-00013-gef0e1ea lkl#34
[   10.066389] Hardware name: MicroLink                               /D865GLC                        , BIOS BF86510A.86A.0077.P25.0508040031 08/04/2005
[   10.066401] task: f62db800 task.stack: f5970000
[   10.066409] EIP: 0060:[<c16054b4>] EFLAGS: 00010286 CPU: 0
[   10.066417] EIP is at mutex_lock+0xa/0x15
[   10.066424] EAX: 00000104 EBX: 00000104 ECX: 00000000 EDX: 80000000
[   10.066432] ESI: 00000000 EDI: 00000104 EBP: f5be8000 ESP: f5971b58
[   10.066439]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   10.066446] CR0: 80050033 CR2: 00000104 CR3: 35945000 CR4: 000006d0
[   10.066453] Stack:
[   10.066459]  f503d740 f824dddf 00000000 f61170c0 f61170c0 f82371ae f850f40e 00000001
[   10.066476]  f61170c0 f5971bcc f5be8000 f9c2d401 00000001 f8236fcc 00000001 00000000
[   10.066491]  f5144014 f5be8104 00000008 f9c5267c 00000007 f61170c0 f5144400 f9c4ff00
[   10.066507] Call Trace:
[   10.066526]  [<f824dddf>] ? drm_modeset_lock_all+0x27/0xb3 [drm]
[   10.066545]  [<f82371ae>] ? drm_encoder_cleanup+0x1a/0x132 [drm]
[   10.066559]  [<f850f40e>] ? drm_atomic_helper_connector_reset+0x3f/0x5c [drm_kms_helper]
[   10.066644]  [<f9c2d401>] ? intel_dvo_init+0x569/0x788 [i915]
[   10.066663]  [<f8236fcc>] ? drm_encoder_init+0x43/0x20b [drm]
[   10.066734]  [<f9bf1fce>] ? intel_modeset_init+0x1436/0x17dd [i915]
[   10.066791]  [<f9b37636>] ? i915_driver_load+0x85a/0x15d3 [i915]
[   10.066846]  [<f9b3603d>] ? i915_driver_open+0x5/0x5 [i915]
[   10.066857]  [<c14af4d0>] ? firmware_map_add_entry.part.2+0xc/0xc
[   10.066868]  [<c1343daf>] ? pci_device_probe+0x8e/0x11c
[   10.066878]  [<c140cec8>] ? driver_probe_device+0x1db/0x62e
[   10.066888]  [<c120c010>] ? kernfs_new_node+0x29/0x9c
[   10.066897]  [<c13438e0>] ? pci_match_device+0xd9/0x161
[   10.066905]  [<c120c48b>] ? kernfs_create_dir_ns+0x42/0x88
[   10.066914]  [<c140d401>] ? __driver_attach+0xe6/0x11b
[   10.066924]  [<c1303b13>] ? kobject_add_internal+0x1bb/0x44f
[   10.066933]  [<c140d31b>] ? driver_probe_device+0x62e/0x62e
[   10.066941]  [<c140a2d2>] ? bus_for_each_dev+0x46/0x7f
[   10.066950]  [<c140c502>] ? driver_attach+0x1a/0x34
[   10.066958]  [<c140d31b>] ? driver_probe_device+0x62e/0x62e
[   10.066966]  [<c140b758>] ? bus_add_driver+0x217/0x32a
[   10.066975]  [<f8403000>] ? 0xf8403000
[   10.066982]  [<c140de27>] ? driver_register+0x5f/0x108
[   10.066991]  [<c1000493>] ? do_one_initcall+0x49/0x1f6
[   10.067000]  [<c1082299>] ? pick_next_task_fair+0x14b/0x2a3
[   10.067008]  [<c1603c8d>] ? __schedule+0x15c/0x4fe
[   10.067016]  [<c1604104>] ? preempt_schedule_common+0x19/0x3c
[   10.067027]  [<c11051de>] ? do_init_module+0x17/0x230
[   10.067035]  [<c1604139>] ? _cond_resched+0x12/0x1a
[   10.067044]  [<c116f9aa>] ? kmem_cache_alloc+0x8f/0x11f
[   10.067052]  [<c11051de>] ? do_init_module+0x17/0x230
[   10.067060]  [<c11703dd>] ? kfree+0x137/0x203
[   10.067068]  [<c110523d>] ? do_init_module+0x76/0x230
[   10.067078]  [<c10cadf3>] ? load_module+0x2a39/0x333f
[   10.067087]  [<c10cb8b2>] ? SyS_finit_module+0x96/0xd5
[   10.067096]  [<c1132231>] ? vm_mmap_pgoff+0x79/0xa0
[   10.067105]  [<c1001e96>] ? do_fast_syscall_32+0xb5/0x1b0
[   10.067114]  [<c16086a6>] ? sysenter_past_esp+0x47/0x75
[   10.067121] Code: c8 f7 76 c1 e8 8e cc d2 ff e9 45 fe ff ff 66 90 66 90 66 90 66 90 90 ff 00 7f 05 e8 4e 0c 00 00 c3 53 89 c3 e8 75 ec ff ff 89 d8 <ff> 08 79 05 e8 fa 0a 00 00 5b c3 53 89 c3 85 c0 74 1b 8b 03 83
[   10.067180] EIP: [<c16054b4>] mutex_lock+0xa/0x15 SS:ESP 0068:f5971b58
[   10.067190] CR2: 0000000000000104
[   10.067222] ---[ end trace 049f1f09da45a856 ]---

Reported-by: Meelis Roos <mroos@linux.ee>
Fixes: 580d8ed ("drm/i915: Give encoders useful names")
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160823092558.14931-1-chris@chris-wilson.co.uk
(cherry picked from commit 8f76aa0)
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:

1) "copy_from_user() buffer size is too small" compile warning/error

   This is a static warning which happens when object size and copy size
   are both const, and copy size > object size.  I didn't see any false
   positives for this one.  So the function warning attribute seems to
   be working fine here.

   Note this scenario is always a bug and so I think it should be
   changed to *always* be an error, regardless of
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.

2) "copy_from_user() buffer size is not provably correct" compile warning

   This is another static warning which happens when I enable
   __compiletime_object_size() for new compilers (and
   CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
   is const, but copy size is *not*.  In this case there's no way to
   compare the two at build time, so it gives the warning.  (Note the
   warning is a byproduct of the fact that gcc has no way of knowing
   whether the overflow function will be called, so the call isn't dead
   code and the warning attribute is activated.)

   So this warning seems to only indicate "this is an unusual pattern,
   maybe you should check it out" rather than "this is a bug".

   I get 102(!) of these warnings with allyesconfig and the
   __compiletime_object_size() gcc check removed.  I don't know if there
   are any real bugs hiding in there, but from looking at a small
   sample, I didn't see any.  According to Kees, it does sometimes find
   real bugs.  But the false positive rate seems high.

3) "Buffer overflow detected" runtime warning

   This is a runtime warning where object size is const, and copy size >
   object size.

All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:

  2fb0815 ("gcc4: disable __compiletime_object_size for GCC 4.6+")

That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size().  But in fact,
__compiletime_object_size() seems to be working fine.  The false
positives were instead triggered by lkl#2 above.  (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)

So remove warning lkl#2 to get rid of all the false positives, and re-enable
warnings lkl#1 and lkl#3 by reverting the above commit.

Furthermore, since lkl#1 is a real bug which is detected at compile time,
upgrade it to always be an error.

Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
On handling amsdu on rx path, get the rx_status from htt context. Without this
fix, we are seeing warnings when running DBDC traffic like this.

WARNING: CPU: 0 PID: 0 at net/mac80211/rx.c:4105 ieee80211_rx_napi+0x88/0x7d8 [mac80211]()

[ 1715.878248] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.21 lkl#1
[ 1715.878273] [<c001d3f4>] (unwind_backtrace) from [<c001a4b0>] (show_stack+0x10/0x14)
[ 1715.878293] [<c001a4b0>] (show_stack) from [<c01bee64>] (dump_stack+0x70/0xbc)
[ 1715.878315] [<c01bee64>] (dump_stack) from [<c002a61c>] (warn_slowpath_common+0x64/0x88)
[ 1715.878339] [<c002a61c>] (warn_slowpath_common) from [<c002a6d0>] (warn_slowpath_null+0x18/0x20)
[ 1715.878395] [<c002a6d0>] (warn_slowpath_null) from [<bf4caa98>] (ieee80211_rx_napi+0x88/0x7d8 [mac80211])
[ 1715.878474] [<bf4caa98>] (ieee80211_rx_napi [mac80211]) from [<bf568658>] (ath10k_htt_t2h_msg_handler+0xb48/0xbfc [ath10k_core])
[ 1715.878535] [<bf568658>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf568708>] (ath10k_htt_t2h_msg_handler+0xbf8/0xbfc [ath10k_core])
[ 1715.878597] [<bf568708>] (ath10k_htt_t2h_msg_handler [ath10k_core]) from [<bf569160>] (ath10k_htt_txrx_compl_task+0xa54/0x1170 [ath10k_core])
[ 1715.878639] [<bf569160>] (ath10k_htt_txrx_compl_task [ath10k_core]) from [<c002db14>] (tasklet_action+0xb4/0x130)
[ 1715.878659] [<c002db14>] (tasklet_action) from [<c002d110>] (__do_softirq+0xe0/0x210)
[ 1715.878678] [<c002d110>] (__do_softirq) from [<c002d4b4>] (irq_exit+0x84/0xe0)
[ 1715.878700] [<c002d4b4>] (irq_exit) from [<c005a544>] (__handle_domain_irq+0x98/0xd0)
[ 1715.878722] [<c005a544>] (__handle_domain_irq) from [<c00085f4>] (gic_handle_irq+0x38/0x5c)
[ 1715.878741] [<c00085f4>] (gic_handle_irq) from [<c0009680>] (__irq_svc+0x40/0x74)
[ 1715.878753] Exception stack(0xc05f9f50 to 0xc05f9f98)
[ 1715.878767] 9f40: ffffffed 00000000 00399e1e c000a220
[ 1715.878786] 9f60: 00000000 c05f6780 c05f8000 00000000 c05f5db8 ffffffed c05f8000 c04d1980
[ 1715.878802] 9f80: 00000000 c05f9f98 c0018110 c0018114 60000013 ffffffff
[ 1715.878822] [<c0009680>] (__irq_svc) from [<c0018114>] (arch_cpu_idle+0x2c/0x50)
[ 1715.878844] [<c0018114>] (arch_cpu_idle) from [<c00530d4>] (cpu_startup_entry+0x108/0x234)
[ 1715.878866] [<c00530d4>] (cpu_startup_entry) from [<c05c7be0>] (start_kernel+0x33c/0x3b8)
[ 1715.878879] ---[ end trace 6d5e1cc0fef8ed6a ]---
[ 1715.878899] ------------[ cut here ]------------

Fixes: 1823566 ("ath10k: cleanup amsdu processing for rx indication")
Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
The warning was seen on AR5416 chip, which invoke ath9k_hw_gio_get()
before the GPIO initialized correctly.

    WARNING: CPU: 1 PID: 1159 at ~/drivers/net/wireless/ath/ath9k/hw.c:2776 ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]
    ...
    CPU: 1 PID: 1159 Comm: systemd-udevd Not tainted 4.7.0-rc7-aptosid-amd64 lkl#1 aptosid 4.7~rc7-1~git92.slh.3
    Hardware name:                  /DH67CL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
      0000000000000286 00000000f912d633 ffffffff81290fd3 0000000000000000
      0000000000000000 ffffffff81063fd4 ffff88040c6dc018 0000000000000000
      0000000000000002 0000000000000000 0000000000000100 ffff88040c6dc018
    Call Trace:
      [<ffffffff81290fd3>] ? dump_stack+0x5c/0x79
      [<ffffffff81063fd4>] ? __warn+0xb4/0xd0
      [<ffffffffa0668fb8>] ? ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]

Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
According to the CI test machines, SNB also uses the
GEN7_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE value to report a bad
GEN6_PCODE_MIN_FREQ_TABLE request.

[  157.744641] WARNING: CPU: 5 PID: 9238 at
drivers/gpu/drm/i915/intel_pm.c:7760 sandybridge_pcode_write+0x141/0x200 [i915]
[  157.744642] Missing switch case (16) in gen6_check_mailbox_status
[  157.744642] Modules linked in: snd_hda_intel i915 ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me lpc_ich snd_pcm mei broadcom bcm_phy_lib tg3 ptp pps_core [last unloaded: vgem]
[  157.744658] CPU: 5 PID: 9238 Comm: drv_hangman Tainted: G     U  W 4.8.0-rc3-CI-CI_DRM_1589+ lkl#1
[  157.744658] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
[  157.744659]  0000000000000000 ffff88011f093a98 ffffffff81426415 ffff88011f093ae8
[  157.744662]  0000000000000000 ffff88011f093ad8 ffffffff8107d2a6 00001e50810d3c9f
[  157.744663]  ffff880128680000 0000000000000008 0000000000000000 ffff88012868a650
[  157.744665] Call Trace:
[  157.744669]  [<ffffffff81426415>] dump_stack+0x67/0x92
[  157.744672]  [<ffffffff8107d2a6>] __warn+0xc6/0xe0
[  157.744673]  [<ffffffff8107d30a>] warn_slowpath_fmt+0x4a/0x50
[  157.744685]  [<ffffffffa0029831>] sandybridge_pcode_write+0x141/0x200 [i915]
[  157.744697]  [<ffffffffa002a88a>] intel_enable_gt_powersave+0x64a/0x1330 [i915]
[  157.744712]  [<ffffffffa006b4cb>] ? i9xx_emit_request+0x1b/0x80 [i915]
[  157.744725]  [<ffffffffa0055ed3>] __i915_add_request+0x1e3/0x370 [i915]
[  157.744738]  [<ffffffffa00428bd>] i915_gem_do_execbuffer.isra.16+0xced/0x1b80 [i915]
[  157.744740]  [<ffffffff811a232e>] ? __might_fault+0x3e/0x90
[  157.744752]  [<ffffffffa0043b72>] i915_gem_execbuffer2+0xc2/0x2a0 [i915]
[  157.744753]  [<ffffffff815485b7>] drm_ioctl+0x207/0x4c0
[  157.744765]  [<ffffffffa0043ab0>] ? i915_gem_execbuffer+0x360/0x360 [i915]
[  157.744767]  [<ffffffff810ea4ad>] ?  debug_lockdep_rcu_enabled+0x1d/0x20
[  157.744769]  [<ffffffff811fe09e>] do_vfs_ioctl+0x8e/0x680
[  157.744770]  [<ffffffff811a2377>] ? __might_fault+0x87/0x90
[  157.744771]  [<ffffffff811a232e>] ? __might_fault+0x3e/0x90
[  157.744773]  [<ffffffff810d3df2>] ?  trace_hardirqs_on_caller+0x122/0x1b0
[  157.744774]  [<ffffffff811fe6cc>] SyS_ioctl+0x3c/0x70
[  157.744776]  [<ffffffff8180fe69>] entry_SYSCALL_64_fastpath+0x1c/0xac

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97491
Fixes: 8766050 ("drm/i915/gen6+: Interpret mailbox error flags")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lyude <cpaul@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20160826105926.3413-1-chris@chris-wilson.co.uk
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
(cherry picked from commit 7850d1c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
When a user timer instance is continued without the explicit start
beforehand, the system gets eventually zero-division error like:

  divide error: 0000 [lkl#1] SMP DEBUG_PAGEALLOC KASAN
  CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ lkl#8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
   task: ffff88003c9b2280 task.stack: ffff880027280000
   RIP: 0010:[<ffffffff858e1a6c>]  [<     inline     >] ktime_divns include/linux/ktime.h:195
   RIP: 0010:[<ffffffff858e1a6c>]  [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62
  Call Trace:
   <IRQ>
   [<     inline     >] __run_hrtimer kernel/time/hrtimer.c:1238
   [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 kernel/time/hrtimer.c:1302
   [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336
   [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 arch/x86/kernel/apic/apic.c:933
   [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:957
   [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:487
   <EOI>
   .....

Although a similar issue was spotted and a fix patch was merged in
commit [6b760bb: ALSA: timer: fix division by zero after
SNDRV_TIMER_IOCTL_CONTINUE], it seems covering only a part of
iceberg.

In this patch, we fix the issue a bit more drastically.  Basically the
continue of an uninitialized timer is supposed to be a fresh start, so
we do it for user timers.  For the direct snd_timer_continue() call,
there is no way to pass the initial tick value, so we kick out for the
uninitialized case.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
When a seq-virmidi driver is initialized, it registers a rawmidi
instance with its callback to create an associated seq kernel client.
Currently it's done throughly in rawmidi's register_mutex context.
Recently it was found that this may lead to a deadlock another rawmidi
device that is being attached with the sequencer is accessed, as both
open with the same register_mutex.  This was actually triggered by
syzkaller, as Dmitry Vyukov reported:

======================================================
 [ INFO: possible circular locking dependency detected ]
 4.8.0-rc1+ lkl#11 Not tainted
 -------------------------------------------------------
 syz-executor/7154 is trying to acquire lock:
  (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341

 but task is already holding lock:
  (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> lkl#1 (&grp->list_mutex){++++.+}:
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22
    [<     inline     >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681
    [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822
    [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418
    [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101
    [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297
    [<     inline     >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383
    [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450
    [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645
    [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164
    [<     inline     >] __snd_device_register sound/core/device.c:162
    [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212
    [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749
    [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123
    [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564
    ......

 -> #0 (register_mutex#5){+.+.+.}:
    [<     inline     >] check_prev_add kernel/locking/lockdep.c:1829
    [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1939
    [<     inline     >] validate_chain kernel/locking/lockdep.c:2266
    [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335
    [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
    [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
    [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621
    [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
    [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188
    [<     inline     >] subscribe_port sound/core/seq/seq_ports.c:427
    [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510
    [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579
    [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480
    [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225
    [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440
    [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375
    [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281
    [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274
    [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138
    [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639
    ......

 other info that might help us debug this:

 Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&grp->list_mutex);
                                lock(register_mutex#5);
                                lock(&grp->list_mutex);
   lock(register_mutex#5);

 *** DEADLOCK ***
======================================================

The fix is to simply move the registration parts in
snd_rawmidi_dev_register() to the outside of the register_mutex lock.
The lock is needed only to manage the linked list, and it's not
necessarily to cover the whole initialization process.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
The resent conversion of the cpu hotplug support in the uncore driver
introduced a regression due to the way the callbacks are invoked at
initialization time.

The old code called the prepare/starting/online function on each online cpu
as a block. The new code registers the hotplug callbacks in the core for
each state. The core invokes the callbacks at each registration on all
online cpus.

The code implicitely relied on the prepare/starting/online callbacks being
called as combo on a particular cpu, which was not obvious and completely
undocumented.

The resulting subtle wreckage happens due to the way how the uncore code
manages shared data structures for cpus which share an uncore resource in
hardware. The sharing is determined in the cpu starting callback, but the
prepare callback allocates per cpu data for the upcoming cpu because
potential sharing is unknown at this point. If the starting callback finds
a online cpu which shares the hardware resource it takes a refcount on the
percpu data of that cpu and puts the own data structure into a
'free_at_online' pointer of that shared data structure. The online callback
frees that.

With the old model this worked because in a starting callback only one non
unused structure (the one of the starting cpu) was available. The new code
allocates the data structures for all cpus when the prepare callback is
registered.

Now the starting function iterates through all online cpus and looks for a
data structure (skipping its own) which has a matching hardware id. The id
member of the data structure is initialized to 0, but the hardware id can
be 0 as well. The resulting wreckage is:

  CPU0 finds a matching id on CPU1, takes a refcount on CPU1 data and puts
  its own data structure into CPU1s data structure to be freed.

  CPU1 skips CPU0 because the data structure is its allegedly unsued own.
  It finds a matching id on CPU2, takes a refcount on CPU1 data and puts
  its own data structure into CPU2s data structure to be freed.

  ....

Now the online callbacks are invoked.

  CPU0 has a pointer to CPU1s data and frees the original CPU0 data. So
  far so good.

  CPU1 has a pointer to CPU2s data and frees the original CPU1 data, which
  is still referenced by CPU0 ---> Booom

So there are two issues to be solved here:

1) The id field must be initialized at allocation time to a value which
   cannot be a valid hardware id, i.e. -1

   This prevents the above scenario, but now CPU1 and CPU2 both stick their
   own data structure into the free_at_online pointer of CPU0. So we leak
   CPU1s data structure.

2) Fix the memory leak described in lkl#1

   Instead of having a single pointer, use a hlist to enqueue the
   superflous data structures which are then freed by the first cpu
   invoking the online callback.

Ideally we should know the sharing _before_ invoking the prepare callback,
but that's way beyond the scope of this bug fix.

[ tglx: Rewrote changelog ]

Fixes: 96b2bd3 ("perf/x86/amd/uncore: Convert to hotplug state machine")
Reported-and-tested-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160909160822.lowgmkdwms2dheyv@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
On lubbock board, the probe of the driver crashes by dereferencing very
early a platform_data structure which is not set, in
pxa2xx_configure_sockets().

The stack fixed is :
[    0.244353] SA1111 Microprocessor Companion Chip: silicon revision 1, metal revision 1
[    0.256321] sa1111 sa1111: Providing IRQ336-390
[    0.340899] clocksource: Switched to clocksource oscr0
[    0.472263] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[    0.480469] pgd = c0004000
[    0.483432] [00000004] *pgd=00000000
[    0.487105] Internal error: Oops: f5 [lkl#1] ARM
[    0.491497] Modules linked in:
[    0.494650] CPU: 0 PID: 1 Comm: swapper Not tainted 4.8.0-rc3-00080-g1aaa68426f0c-dirty #2068
[    0.503229] Hardware name: Intel DBPXA250 Development Platform (aka Lubbock)
[    0.510344] task: c3e42000 task.stack: c3e44000
[    0.514984] PC is at pxa2xx_configure_sockets+0x4/0x24 (drivers/pcmcia/pxa2xx_base.c:227)
[    0.520193] LR is at pcmcia_lubbock_init+0x1c/0x38
[    0.525079] pc : [<c0247c30>]    lr : [<c02479b0>]    psr: a0000053
[    0.525079] sp : c3e45e70  ip : 100019ff  fp : 00000000
[    0.536651] r10: c0828900  r9 : c0434838  r8 : 00000000
[    0.541953] r7 : c0820700  r6 : c0857b30  r5 : c3ec1400  r4 : c0820758
[    0.548549] r3 : 00000000  r2 : 0000000c  r1 : c3c09c40  r0 : c3ec1400
[    0.555154] Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
[    0.562450] Control: 0000397f  Table: a0004000  DAC: 00000053
[    0.568257] Process swapper (pid: 1, stack limit = 0xc3e44190)
[    0.574154] Stack: (0xc3e45e70 to 0xc3e46000)
[    0.578610] 5e60:                                     c4849800 00000000 c3ec1400 c024769c
[    0.586928] 5e80: 00000000 c3ec140c c3c0ee0c c3ec1400 c3ec1434 c020c41 c3ec1400 c3ec1434
[    0.595244] 5ea0: c0820700 c080b408 c0828900 c020c5f8 00000000 c0820700 c020c57 c020ac5c
[    0.603560] 5ec0: c3e687cc c3e71e10 c0820700 00000000 c3c02de0 c020bae4 c03c62f7 c03c62f7
[    0.611872] 5ee0: c3e68780 c0820700 c042e034 00000000 c043c440 c020cdec c080b408 00000005
[    0.620188] 5f00: c042e034 c00096c0 c0034440 c01c730c 20000053 ffffffff 00000000 00000000
[    0.628502] 5f20: 00000000 c3ffcb87 c3ffcb90 c00346ac c3e66ba0 c03f7914 00000092 00000005
[    0.636811] 5f40: 00000005 c03f847c 00000091 c03f847c 00000000 00000005 c0434828 00000005
[    0.645125] 5f60: c043482c 00000092 c043c440 c0828900 c0434838 c0418d2c 00000005 00000005
[    0.653430] 5f80: 00000000 c041858c 00000000 c032e9f0 00000000 00000000 00000000 00000000
[    0.661729] 5fa0: 00000000 c032e9f8 00000000 c000f0f0 00000000 00000000 00000000 00000000
[    0.670020] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.678311] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    0.686673] (pxa2xx_configure_sockets) from pcmcia_lubbock_init (/drivers/pcmcia/sa1111_lubbock.c:161)
[    0.696026] (pcmcia_lubbock_init) from pcmcia_probe (/drivers/pcmcia/sa1111_generic.c:213)
[    0.704358] (pcmcia_probe) from driver_probe_device (/drivers/base/dd.c:378 /drivers/base/dd.c:499)
[    0.712848] (driver_probe_device) from __driver_attach (/./include/linux/device.h:983 /drivers/base/dd.c:733)
[    0.721414] (__driver_attach) from bus_for_each_dev (/drivers/base/bus.c:313)
[    0.729723] (bus_for_each_dev) from bus_add_driver (/drivers/base/bus.c:708)
[    0.738036] (bus_add_driver) from driver_register (/drivers/base/driver.c:169)
[    0.746185] (driver_register) from do_one_initcall (/init/main.c:778)
[    0.754561] (do_one_initcall) from kernel_init_freeable (/init/main.c:843 /init/main.c:851 /init/main.c:869 /init/main.c:1016)
[    0.763409] (kernel_init_freeable) from kernel_init (/init/main.c:944)
[    0.771660] (kernel_init) from ret_from_fork (/arch/arm/kernel/entry-common.S:119)
[ 0.779347] Code: c03c6305 c03c631e c03c632e e5903048 (e993000c)
All code
========
   0:	c03c6305 	eorsgt	r6, ip, r5, lsl lkl#6
   4:	c03c631e 	eorsgt	r6, ip, lr, lsl r3
   8:	c03c632e 	eorsgt	r6, ip, lr, lsr lkl#6
   c:	e5903048 	ldr	r3, [r0, lkl#72]	; 0x48
  10:*	e993000c 	ldmib	r3, {r2, r3}		<-- trapping instruction

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
rsc_lookup steals the passed-in memory to avoid doing an allocation of
its own, so we can't just pass in a pointer to memory that someone else
is using.

If we really want to avoid allocation there then maybe we should
preallocate somwhere, or reference count these handles.

For now we should revert.

On occasion I see this on my server:

kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851!
kernel: invalid opcode: 0000 [lkl#1] SMP
kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b lkl#15
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: events do_cache_clean [sunrpc]
kernel: task: ffff8808541d8000 task.stack: ffff880854344000
kernel: RIP: 0010:[<ffffffff811e7075>]  [<ffffffff811e7075>] kfree+0x155/0x180
kernel: RSP: 0018:ffff880854347d70  EFLAGS: 00010246
kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600
kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064
kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780
kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500
kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0
kernel: FS:  0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0
kernel: Stack:
kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0
kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006
kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f
kernel: Call Trace:
kernel: [<ffffffffa013cf76>] rsc_free+0x16/0x90 [auth_rpcgss]
kernel: [<ffffffffa013d006>] rsc_put+0x16/0x30 [auth_rpcgss]
kernel: [<ffffffffa0406f60>] cache_clean+0x2e0/0x300 [sunrpc]
kernel: [<ffffffffa04073ee>] do_cache_clean+0xe/0x70 [sunrpc]
kernel: [<ffffffff8109a70f>] process_one_work+0x1ff/0x3b0
kernel: [<ffffffff8109b15c>] worker_thread+0x2bc/0x4a0
kernel: [<ffffffff8109aea0>] ? rescuer_thread+0x3a0/0x3a0
kernel: [<ffffffff810a0ba4>] kthread+0xe4/0xf0
kernel: [<ffffffff8169c47f>] ret_from_fork+0x1f/0x40
kernel: [<ffffffff810a0ac0>] ? kthread_stop+0x110/0x110
kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff <0f> 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff
kernel: RIP  [<ffffffff811e7075>] kfree+0x155/0x180
kernel: RSP <ffff880854347d70>
kernel: ---[ end trace 3fdec044969def26 ]---

It seems to be most common after a server reboot where a client has been
using a Kerberos mount, and reconnects to continue its workload.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [lkl#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ lkl#98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Tejun Heo <tj@kernel.org>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
af_iucv socket programs with HiperSockets as transport make use of the qdio
completion queue. Running such an af_iucv socket program may result in a
crash:

[90341.677709] Oops: 0038 ilc:2 [lkl#1] SMP
[90341.677743] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.6.0-20160720.0.0e86ec7.5e62689.fc23.s390xperformance lkl#1
[90341.677744] Hardware name: IBM              2964 N96              703              (LPAR)
[90341.677746] task: 00000000edb79f00 ti: 00000000edb84000 task.ti: 00000000edb84000
[90341.677748] Krnl PSW : 0704d00180000000 000000000075bc50 (qeth_qdio_input_handler+0x258/0x4e0)
[90341.677756]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 000003d10391e900 0000000000000001 00000000e61e6000 0000000000000005
[90341.677759]            0000000000a9e6ec 5420040001a77400 0000000000000001 000000000000006f
[90341.677761]            00000000e0d83f00 0000000000000003 0000000000000010 5420040001a77400
[90341.677784]            000000007ba8b000 0000000000943fd0 000000000075bc4e 00000000ed3b3c10
[90341.677793] Krnl Code: 000000000075bc42: e320cc180004        lg      %r2,3096(%r12)
           000000000075bc48: c0e5ffffc5cc       brasl   %r14,7547e0
          #000000000075bc4e: 1816               lr      %r1,%r6
          >000000000075bc50: ba19b008           cs      %r1,%r9,8(%r11)
           000000000075bc54: ec180041017e       cij     %r1,1,8,75bcd6
           000000000075bc5a: 5810b008           l       %r1,8(%r11)
           000000000075bc5e: ec16005c027e       cij     %r1,2,6,75bd16
           000000000075bc64: 5090b008           st      %r9,8(%r11)
[90341.677807] Call Trace:
[90341.677810] ([<000000000075bbc0>] qeth_qdio_input_handler+0x1c8/0x4e0)
[90341.677812] ([<000000000070efbc>] qdio_kick_handler+0x124/0x2a8)
[90341.677814] ([<0000000000713570>] __tiqdio_inbound_processing+0xf0/0xcd0)
[90341.677818] ([<0000000000143312>] tasklet_action+0x92/0x120)
[90341.677823] ([<00000000008b6e72>] __do_softirq+0x112/0x308)
[90341.677824] ([<0000000000142bce>] irq_exit+0xd6/0xf8)
[90341.677829] ([<000000000010b1d2>] do_IRQ+0x6a/0x88)
[90341.677830] ([<00000000008b6322>] io_int_handler+0x112/0x220)
[90341.677832] ([<0000000000102b2e>] enabled_wait+0x56/0xa8)
[90341.677833] ([<0000000000000000>]           (null))
[90341.677835] ([<0000000000102e32>] arch_cpu_idle+0x32/0x48)
[90341.677838] ([<000000000018a126>] cpu_startup_entry+0x266/0x2b0)
[90341.677841] ([<0000000000113b38>] smp_start_secondary+0x100/0x110)
[90341.677843] ([<00000000008b68a6>] restart_int_handler+0x62/0x78)
[90341.677845] ([<00000000008b6588>] psw_idle+0x3c/0x40)
[90341.677846] Last Breaking-Event-Address:
[90341.677848]  [<00000000007547ec>] qeth_dbf_longtext+0xc/0xc0
[90341.677849]
[90341.677850] Kernel panic - not syncing: Fatal exception in interrupt

qeth_qdio_cq_handler() analyzes SBALs on this completion queue, but does
not observe the limit of 16 SBAL elements per SBAL. This patch adds the
additional check to process not more than 16 SBAL elements.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
Disable creation of a UDP socket for ipv6 when
CONFIG_IPV6 is not enabeld. Since udp_sock_create6()
returns 0 when CONFIG_IPV6 is not set

[   46.888632] IP: [<c220705a>] setup_udp_tunnel_sock+0x6/0x4f
[   46.891355] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[   46.893918] Oops: 0002 [lkl#1] PREEMPT
[   46.896014] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-rc4-00001-g8700e3e lkl#1
[   46.900280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   46.904905] task: cf06c040 ti: cf05e000 task.ti: cf05e000
[   46.907854] EIP: 0060:[<c220705a>] EFLAGS: 00210246 CPU: 0
[   46.911137] EIP is at setup_udp_tunnel_sock+0x6/0x4f
[   46.914070] EAX: 00000044 EBX: 00000001 ECX: cf05fef0 EDX: ca8142e0
[   46.917236] ESI: c2c4505b EDI: cf05fef0 EBP: cf05fed0 ESP: cf05fed0
[   46.919836]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   46.922046] CR0: 80050033 CR2: 000001fc CR3: 02cec000 CR4: 000006b0
[   46.924550] Stack:
[   46.926014]  cf05ff10 c1fd4657 ca8142e0 0000000a 00000000 00000000 0000b712 00000008
[   46.931274]  00000000 6bb5bd01 c1fd48de 00000000 00000000 cf05ff1c 00000000 00000000
[   46.936122]  cf05ff1c c1fd4bdf 00000000 cf05ff28 c2c4507b ffffffff cf05ff88 c2bf1c74
[   46.942350] Call Trace:
[   46.944403]  [<c1fd4657>] rxe_setup_udp_tunnel+0x8f/0x99
[   46.947689]  [<c1fd48de>] ? net_to_rxe+0x4e/0x4e
[   46.950567]  [<c1fd4bdf>] rxe_net_init+0xe/0xa4
[   46.953147]  [<c2c4507b>] rxe_module_init+0x20/0x4c
[   46.955448]  [<c2bf1c74>] do_one_initcall+0x89/0x113
[   46.957797]  [<c2bf15eb>] ? set_debug_rodata+0xf/0xf
[   46.959966]  [<c2bf1dbc>] ? kernel_init_freeable+0xbe/0x15b
[   46.962262]  [<c2bf1ddc>] kernel_init_freeable+0xde/0x15b
[   46.964418]  [<c232eb54>] kernel_init+0x8/0xd0
[   46.966618]  [<c2333122>] ret_from_kernel_thread+0xe/0x24
[   46.969592]  [<c232eb4c>] ? rest_init+0x6f/0x6f

Fixes: 8700e3e ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
Since commit 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      lkl#1 lkl#2 lkl#3 lkl#4 lkl#5 lkl#6 lkl#7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

This happens because the code introduced in this commit dereferences the
debug store pointer unconditionally. The debug store is not guaranteed to
be available, so a NULL pointer check as on other places is required.

Fixes: 4d4c474 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20160920131220.xg5pbdjtznszuyzb@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
While running a compile on arm64, I hit a memory exposure

usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes)
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:75!
Internal error: Oops - BUG: 0 [lkl#1] SMP
Modules linked in: ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp
llc ebtable_nat ip6table_security ip6table_raw ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
iptable_security iptable_raw iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac
xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma
mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd
auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan
sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys
CPU: 0 PID: 19744 Comm: updatedb Tainted: G        W 4.8.0-rc3-threadinfo+ lkl#1
Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
task: fffffe03df944c00 task.stack: fffffe00d128c000
PC is at __check_object_size+0x70/0x3f0
LR is at __check_object_size+0x70/0x3f0
...
[<fffffc00082b4280>] __check_object_size+0x70/0x3f0
[<fffffc00082cdc30>] filldir64+0x158/0x1a0
[<fffffc0000f327e8>] __fat_readdir+0x4a0/0x558 [fat]
[<fffffc0000f328d4>] fat_readdir+0x34/0x40 [fat]
[<fffffc00082cd8f8>] iterate_dir+0x190/0x1e0
[<fffffc00082cde58>] SyS_getdents64+0x88/0x120
[<fffffc0008082c70>] el0_svc_naked+0x24/0x28

fffffc0000f3b1a8 is a module address. Modules may have compiled in
strings which could get copied to userspace. In this instance, it
looks like "." which matches with a size of 1 byte. Extend the
is_vmalloc_addr check to be is_vmalloc_or_module_addr to cover
all possible cases.

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
liuyuan10 pushed a commit to liuyuan10/linux that referenced this pull request Nov 22, 2016
The wq_numa_init() function makes a private CPU to node map by calling
cpu_to_node() early in the boot process, before the non-boot CPUs are
brought online.  Since the default implementation of cpu_to_node()
returns zero for CPUs that have never been brought online, the
workqueue system's view is that *all* CPUs are on node zero.

When the unbound workqueue for a non-zero node is created, the
tsk_cpus_allowed() for the worker threads is the empty set because
there are, in the view of the workqueue system, no CPUs on non-zero
nodes.  The code in try_to_wake_up() using this empty cpumask ends up
using the cpumask empty set value of NR_CPUS as an index into the
per-CPU area pointer array, and gets garbage as it is one past the end
of the array.  This results in:

[    0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
[    1.970095] pgd = fffffc00094b0000
[    1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[    1.982610] Internal error: Oops: 96000004 [lkl#1] SMP
[    1.987541] Modules linked in:
[    1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G        W       4.8.0-rc6-preempt-vol+ lkl#9
[    1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
[    2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
[    2.011158] PC is at try_to_wake_up+0x194/0x34c
[    2.015737] LR is at try_to_wake_up+0x150/0x34c
[    2.020318] pc : [<fffffc00080e7468>] lr : [<fffffc00080e7424>] pstate: 600000c5
[    2.027803] sp : fffffe0fe8b8fb10
[    2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
[    2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
[    2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
[    2.047270] x23: 00000000000000c0 x22: 0000000000000004
[    2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
[    2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
[    2.063386] x17: 0000000000000000 x16: 0000000000000000
[    2.068760] x15: 0000000000000018 x14: 0000000000000000
[    2.074133] x13: 0000000000000000 x12: 0000000000000000
[    2.079505] x11: 0000000000000000 x10: 0000000000000000
[    2.084879] x9 : 0000000000000000 x8 : 0000000000000000
[    2.090251] x7 : 0000000000000040 x6 : 0000000000000000
[    2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
[    2.100991] x3 : 0000000000000000 x2 : 0000000000000000
[    2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
[    2.111737]
[    2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
[    2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
[    2.125914] fb00:                                   fffffe0fe8b8fb80 fffffc00080e7648
.
.
.
[    2.442859] Call trace:
[    2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
[    2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
[    2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
[    2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
[    2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
[    2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
[    2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
[    2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
[    2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
[    2.523156] fa60: 0000000000000000 0000000000000000
[    2.528089] [<fffffc00080e7468>] try_to_wake_up+0x194/0x34c
[    2.533723] [<fffffc00080e7648>] wake_up_process+0x28/0x34
[    2.539275] [<fffffc00080d3764>] create_worker+0x110/0x19c
[    2.544824] [<fffffc00080d69dc>] alloc_unbound_pwq+0x3cc/0x4b0
[    2.550724] [<fffffc00080d6bcc>] wq_update_unbound_numa+0x10c/0x1e4
[    2.557066] [<fffffc00080d7d78>] workqueue_online_cpu+0x220/0x28c
[    2.563234] [<fffffc00080bd288>] cpuhp_invoke_callback+0x6c/0x168
[    2.569398] [<fffffc00080bdf74>] cpuhp_up_callbacks+0x44/0xe4
[    2.575210] [<fffffc00080be194>] cpuhp_thread_fun+0x13c/0x148
[    2.581027] [<fffffc00080dfbac>] smpboot_thread_fn+0x19c/0x1a8
[    2.586929] [<fffffc00080dbd64>] kthread+0xdc/0xf0
[    2.591776] [<fffffc0008083380>] ret_from_fork+0x10/0x50
[    2.597147] Code: b00057e 91304021 91005021 b8626822 (b8606821)
[    2.603464] ---[ end trace 58c0cd36b88802bc ]---
[    2.608138] Kernel panic - not syncing: Fatal exception

Fix by moving call to numa_store_cpu_info() for all CPUs into
smp_prepare_cpus(), which happens before wq_numa_init().  Since
smp_store_cpu_info() now contains only a single function call,
simplify by removing the function and out-lining its contents.

Suggested-by: Robert Richter <rric@kernel.org>
Fixes: 1a2db30 ("arm64, numa: Add NUMA support for arm64 platforms.")
Cc: <stable@vger.kernel.org> # 4.7.x-
Signed-off-by: David Daney <david.daney@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Tested-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
@ghost ghost mentioned this pull request Aug 4, 2018
@D0nYu D0nYu mentioned this pull request Dec 11, 2024
@rodionov rodionov mentioned this pull request Dec 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants