Okay. Here is how it works.
First line takes the 32-bit product of x*y. Now we want to take this mod 216 + 1…
…but first we have to take care of the case where x=0 or y=0 (in which case we multiplied by 0 when we should've been multiplying by -1). If this was the case, then the product p will be 0 — this is what the if statement checks for. In that case, 65537-x-y is the correct answer because (x+y) will be either 0+y=y, if x was 0, or x+0=x if y was 0, and 65537 minus that takes multiplies it by -1 or 216 mod 216+1.
Otherwise, we're fine to take the mod. x=p>>16 and y=p take the highest 16 bits and lowest 16 bits of p, respectively. Normally, subtracting these gives the correct answer of p mod 216+1.
The problem occurs if there's a carry during the subtraction, for instance if x=y-x subtracted 17 from 3 which would become 216-(17-3) = 65522. This is checked for by the line if (y<x), since this can only happen if there was a carry. The reason this is a problem is that the carry wraps around mod 216, whereas we want things to wrap around mod 216+1. This is fixed by adding 1 to x, which for some reason the author of that code wrote as adding 65537 to x: this doesn't make a difference, though, since 65537 is 0x10001 which gets truncated to 0x0001 anyway.
There's no special code to make sure that the case of x*y = 216 (which needs to be written as 0x0000) gets treated correctly. But we don't need it. If we accept that we calculate p mod 216+1 right, then if p mod 216+1 = 216, the correct answer will be 216. But this is stored as 0x10000 in the 16-bit variable x, so it will get truncated to 0x0000.