Because we're gonna talk about causality (and locality), it may be natural to start chronologically.
People have known for millennia that the cause precedes its effects. For example, before you were born, your parents had to do a certain thing. This fact, known as causality, became a part of Newtonian physics. Classical, non-relativistic physics had a universal time coordinate t that everyone could agree upon. The events and properties of the objects at time t were thought to affect only the events at times t' where t'>t.
Einstein's special relativity revolutionized many properties of space and time, but the previous sentence remained true. In fact, it remained true for all inertial observers. Each reference frame has a different time coordinate (and time and space get mixed), but the statement always holds. What does the required causality in other frames imply for our frame? Well, it implies that the event B affected by the event A must not only come after A, but it must belong to the future light cone of A: physical influences (such as material objects that can influence something) are not only constrained to propagate from the past to the future, but they are also not allowed to travel faster than light.
If you wait for a short time period t, you will only be able to affect the objects that are the distance s=ct from you, or closer. Therefore, roughly speaking, this relativistic version of causality is also called locality (because your influence remains local), and we won't distinguish between the concepts of locality and causality.
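The link between "the cause precedes the effect in every frame" and "the effect lies in the future light cone" can be checked numerically. The following is only a minimal sketch in one space dimension, in units where c = 1; the function names and the sample events are illustrative assumptions, not anything from the discussion above.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an illustrative choice

def boost(t, x, v):
    """Coordinates of the event (t, x) in a frame moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2), gamma * (x - v * t)

def in_future_light_cone(a, b):
    """True if event b = (t, x) lies inside or on the future light cone of event a."""
    dt, dx = b[0] - a[0], b[1] - a[1]
    return dt > 0 and abs(dx) <= C * dt

def order_preserved_in_all_frames(a, b, velocities):
    """True if b happens after a in every sampled inertial frame."""
    return all(boost(*b, v)[0] > boost(*a, v)[0] for v in velocities)

velocities = [i / 100.0 for i in range(-99, 100)]  # sampled frames with |v| < c
A = (0.0, 0.0)
B_inside = (2.0, 1.0)   # reachable from A at or below the speed of light
B_outside = (1.0, 2.0)  # would require a faster-than-light influence

print(in_future_light_cone(A, B_inside),
      order_preserved_in_all_frames(A, B_inside, velocities))
print(in_future_light_cone(A, B_outside),
      order_preserved_in_all_frames(A, B_outside, velocities))
```

For the event inside the light cone, every sampled observer agrees that B comes after A. For the event outside, sufficiently fast observers see B before A, so it could not be an effect of A without breaking causality in their frames.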
General relativity allowed the spacetime to be curved, and the notion of a future light cone had to be modified, too. But for sufficiently small patches of spacetime, the curvature can be neglected and general relativity must always reduce to special relativity for experiments in the "elevator" or other local environments. Therefore we will always mean "special relativity" if we mention that some theories or conclusions are "relativistic".
(In quantum gravity, the metric tensor is a quantum observable. Also, the metric tensor determines the structure of the light cones and the rules for causality, and therefore the causal structure becomes uncertain and confusing. At any rate, string theory is smarter than we are and it is able to avoid these conceptual problems. However, this is not the main topic of this essay.)
OK, what about quantum mechanics that emerged in the 1920s? It described the world in terms of wavefunctions associated with particles. Because the wavefunction is not a real wave, and the electron is always found at one point, Max Born successfully proposed that the wavefunctions are waves of probabilities. Einstein deeply believed that the world was deterministic, or at least that the question about the "state of the Universe" was an objective question with a unique answer. Although he was one of the grandfathers of quantum theory, this feature of quantum mechanics was unacceptable to him, and therefore he tried to show that the leaders of quantum mechanics had to be wrong.
Meanwhile, people like Heisenberg, Bohr, Dirac, Born, Pauli, and others knew perfectly well how to predict the results of experiments using quantum mechanics. On the other hand, in order to "disprove" quantum mechanics, Einstein, Podolsky and Rosen (EPR) prepared a gedanken experiment whose result was supposed to show disagreement with quantum mechanics. Well, they were thinking about a system that splits into two subsystems A,B (a positronium decaying into two photons A,B, for example). Later, A,B are far apart. Nevertheless, the predictions of what happens with B when we measure it depend on the type of measurement that we perform on A.
For example, the angular momentum conservation law implies that the total angular momentum of two photons in A,B must vanish. That means that either both of them are right-handed, or both of them are left-handed (the opposite momenta imply that two R's or two L's cancel). It would be fine with Einstein if the photons were RR or LL - one of these two choices was chosen already when the positronium decayed. However, quantum mechanics predicts that if we measure the linear polarization of the photon A and then B, we always get the opposite polarizations (either xy, or yx). The correlations exist if we measure the circular polarization but also if we measure the linear polarization.
EPR knew that this follows from quantum mechanics. Einstein found this behavior truly counter-intuitive. If the photons are already in the state RR - which is one of the two choices that must occur, he thought - then the probability to find the right-handed photon A to be x-polarized is 50 percent, just like the probability that it is y-polarized. But these two outcomes should be totally uncorrelated with the analogous outcomes for the photon B, which is very far away.
Therefore, Einstein thought, if we measure the linear polarizations of both photons, all four choices xx,xy,yx,yy must have probabilities 25 percent. However, once again, quantum mechanics only predicted xy or yx with probabilities 50 percent. It was able to correlate the two photons in many different ways. If we measure the photon A to be R, then the photon B must suddenly be exactly R, so that the person who measures the circular polarization gets it right. But if we measure the photon A to be x, then the photon B must become linearly polarized in the y-direction, so that the second experimentalist gets the right result.
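The difference between the two pictures can be made explicit with a few lines of linear algebra. This is only a sketch under stated assumptions: both photons are described in one shared, fixed (x, y) basis, "R" and "L" mean (x ± iy)/sqrt(2) for both photons, and the entangled state is taken to be (|RR> - |LL>)/sqrt(2), the sign that reproduces the xy/yx correlations described above in this convention. The helper names are hypothetical.

```python
import numpy as np

# Single-photon polarization states in a shared, fixed (x, y) basis (assumed convention).
x = np.array([1, 0], dtype=complex)
y = np.array([0, 1], dtype=complex)
R = (x + 1j * y) / np.sqrt(2)
L = (x - 1j * y) / np.sqrt(2)

def pair(a, b):
    """Two-photon product state |a>|b>."""
    return np.kron(a, b)

# Quantum mechanics: one entangled state (relative sign is a convention-dependent assumption).
entangled = (pair(R, R) - pair(L, L)) / np.sqrt(2)

def prob_pure(state, a, b):
    """Probability to find photon A in |a> and photon B in |b> for a pure two-photon state."""
    return abs(np.vdot(pair(a, b), state)) ** 2

def prob_mixture(a, b):
    """Einstein's picture: the pair is already RR or LL, each with probability 1/2."""
    return 0.5 * prob_pure(pair(R, R), a, b) + 0.5 * prob_pure(pair(L, L), a, b)

linear = {"x": x, "y": y}
circular = {"R": R, "L": L}

def table(prob, basis):
    """Joint probabilities for all outcome pairs in the given measurement basis."""
    return {p + q: round(float(prob(basis[p], basis[q])), 6)
            for p in basis for q in basis}

quantum_linear = table(lambda a, b: prob_pure(entangled, a, b), linear)
quantum_circular = table(lambda a, b: prob_pure(entangled, a, b), circular)
einstein_linear = table(prob_mixture, linear)

print("QM, linear:      ", quantum_linear)
print("QM, circular:    ", quantum_circular)
print("mixture, linear: ", einstein_linear)
```

The entangled state is perfectly correlated in both the circular and the linear basis, while the "already RR or LL" mixture gives 25 percent for each of xx, xy, yx, yy.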
Once again, Einstein thought that this proved that quantum mechanics must send some "signals" that make it work, and it violated causality and special relativity, and therefore it had to be incorrect. However, today we know for sure that it was Einstein who was misled. Experiments done after Einstein died - for example those by Alain Aspect and his colleagues, and Anton Zeilinger and his colleagues - have clearly shown that all the correlations are there, exactly as quantum mechanics predicts.
There have been many other developments. Prince Louis de Broglie did not want to accept the probabilistic interpretation either, and therefore he proposed the pilot wave theory in the late 1920s. According to this theory, the particles have a well-defined position and momentum, but there also exists an objective wave associated with the particle. This wave creates a potential that affects the particle's motion. If the particle starts at a generic position in the wave, that is, a position distributed according to the probability distribution associated with the wave, the evolution preserves this property at later times. Therefore de Broglie's theory can give the same predictions as quantum mechanics in the simplest contexts, even though it is deterministic: the particle as well as the wave are the usual classical concepts, governed by some deterministic differential equations.
Such theories were called "hidden variable theories". De Broglie's theory was rediscovered in the 1950s by the communist David Bohm and it made him very famous, even though he was not the first one and even though the theory is misguided. (The communists are always good at adopting things that do not belong to them.)
The majority of physicists thought that the question whether the hidden variable theories are more true than the orthodox probabilistic quantum mechanics would remain a philosophical, religious question forever, and that no physical experiment could ever resolve the dispute. Another advocate of these (crappy) hidden variable theories, namely John Bell, decided that it could not be the case, and he wanted to destroy orthodox quantum mechanics forever. He realized that there is a measurable difference. In the example with the two photons, we saw that quantum mechanics predicts "more correlation" related to "more types of measurements" than what our classical intuition finds natural. Bell quantified this observation, and he showed that every local deterministic theory (or even a local theory where the state of the photons is objectively and uniquely given already before the measurement is done) must lead to a combined correlation of various pairs of observables that always belongs to an interval. Quantum mechanics, however, often leads to correlations outside this interval (higher or lower). Consequently, if you perform this experiment, the correlations should respect Bell's bounds - because the world is classical, as Bell believed - and quantum mechanics would be ruled out.
Unfortunately for Bell, the critical experiments were already done during his life. Instead of confirming his deterministic prejudices, they led to a spectacular confirmation of quantum mechanics and its very large correlations - and Bell's approach became one of the key insights showing that quantum mechanics can't really be messed with. These experiments probing entanglement, EPR phenomena, and quantum teleportation rule out all hidden variable theories whose dynamics is local and causal (that is, theories that admit no faster-than-light signals). In other words, if you want to revive these hidden variable theories today, you must give up special relativity and the bound on the maximal speed of signals, and this will most likely lead you to contradictions with various experiments.
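To see the size of the gap between Bell's interval and the quantum prediction, one can evaluate a concrete combined correlation. The sketch below uses the CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b'), which is one standard form of a Bell-type bound and is chosen here for illustration. The state (|xy> + |yx>)/sqrt(2), the analyzer angles, and all function names are assumptions of this example.

```python
import itertools
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=float)
sx = np.array([[0, 1], [1, 0]], dtype=float)

def analyzer(angle):
    """+1/-1 observable: +1 for linear polarization along `angle`, -1 for the perpendicular one."""
    return np.cos(2 * angle) * sz + np.sin(2 * angle) * sx

# Two-photon state (|xy> + |yx>)/sqrt(2) in the basis |xx>, |xy>, |yx>, |yy>.
psi = np.array([0, 1, 1, 0], dtype=float) / np.sqrt(2)

def E(a, b):
    """Quantum correlation of the two analyzers set to angles a (photon A) and b (photon B)."""
    return float(psi @ np.kron(analyzer(a), analyzer(b)) @ psi)

def chsh(corr, a, a2, b, b2):
    """CHSH combination of four correlations."""
    return corr(a, b) - corr(a, b2) + corr(a2, b) + corr(a2, b2)

# Illustrative angle choice (in radians) that maximizes |S| for this state.
a, a2, b, b2 = 0.0, np.pi / 4, -np.pi / 8, -3 * np.pi / 8
S_quantum = chsh(E, a, a2, b, b2)

# Local deterministic picture: every photon carries predetermined answers +1/-1
# for both settings. Enumerate all 16 assignments; averaging over hidden
# variables cannot exceed the largest single value.
S_classical_max = max(
    abs(Aa * Bb - Aa * Bb2 + Aa2 * Bb + Aa2 * Bb2)
    for Aa, Aa2, Bb, Bb2 in itertools.product([-1, 1], repeat=4)
)

print("quantum |S|:", abs(S_quantum))
print("largest |S| for predetermined answers:", S_classical_max)
```

With predetermined answers, |S| never exceeds 2, while the quantum state reaches 2*sqrt(2), roughly 2.83. That excess is what the experiments measure.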
It may be useful to say a few more words about why the hidden variable theories are crappy:
- They have serious problems incorporating special relativity. Some "Bohmian fundamentalists" believe that they can construct Bohmian versions of quantum electrodynamics and perhaps the Standard Model, but it seems to be only their wishful thinking because the general argument above shows that the hidden variable theories that can agree with experiments don't respect causality, and therefore they will generically break special relativity.
- They can't naturally explain physical notions such as the spin. In Bohmian theories, you need to assign an objective classical value to a complete system of observables. This must include a projection of the spin of every particle. But in that case, you must decide which component of the spin is allowed to have this classical value - and that will break rotational invariance. The only reason why the discreteness of the z-component of the spin in quantum mechanics does not break the rotational symmetry is that the amplitudes for the spin are probabilistic, not objective classical numbers.
- Even if we forget about these advanced subjects, the hidden variable theories go against the lessons that quantum mechanics taught us. For example, different observables on the Hilbert space (such as the position and the momentum; or spins with respect to different axes) are equally good observables - and the bases built from their eigenstates are equally good bases. It's not natural to pick the position (plus another random set of observables) and allow them classical values. The Feynman Lectures on Physics are good because they explain this "democracy" between different observables quite nicely.
- In reality, decoherence explains the "classical character of the position of macroscopic objects" dynamically. The fact that the Moon should be thought of as having a well-defined position follows from the Hamiltonian (and the decoherence calculations), not from a pre-determined special role of the position. Moreover, decoherence (combined with consistent histories) solves many other conceptual problems of the Copenhagen quantum mechanics (especially the emergence of the boundary between "classical" and "quantum"), and therefore the reasons to abandon quantum mechanics keep on converging to zero.
OK, let me now emphasize that quantum mechanics - for example quantum mechanics extended to the relativistic world, such as in quantum field theory - respects not only all the rules of quantum theory (the probabilistic character of the amplitudes; the possibility to entangle distant objects; the superposition principle for the wavefunction; the possibility to have higher correlations than Bell's bound), but it also respects the rules of special relativity.
It is useful to think in the Heisenberg picture. The field operators evolve according to the same equations as their classical counterparts (classical fields). The classical field is only affected by the values of this and other fields in the past light cone. Correspondingly, all correlators and expectation values of an operator will only be affected by the observables constructed from the operators in its past light cone. (This includes the probabilities of various things, because a probability is the expectation value of a projector, and a projector is a function of the other observables.) The commutation relations respect Lorentz invariance, too: in particular, observables at space-like separated points commute. This implies that no superluminal signals are possible in quantum field theory (also called "relativistic quantum field theory" or "local quantum field theory").
Even if you "feel" that something "seems" to propagate faster than light between A,B, you will never be able to use the EPR effects to inform someone who lives on the Sun about your new Nobel prize in less than about 8 minutes.
It should be repeated that the only reason why superluminal signals are not possible in QFT is that the outcomes of the experiments are probabilistic. Consider the example with two photons from the beginning. Why can't we send a signal from A to B superluminally? It's because the results of the B measurement are always 50:50 and the person A just can't affect them. If we forget about A altogether, the probabilities to get L or R for the photon B are always 50:50, much like for x:y, even if the person A jumps like mad. If we think about both A and B, they can be correlated, but it's not correct to say that the outcome in B was a consequence of anything done at A. The measurements at A,B can be space-like separated, and it would be foolish to talk about a causal relation between them.
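The 50:50 statement can be verified directly: whatever analyzer angle the person A chooses, the probabilities seen by B alone do not move. The following is a minimal sketch, assuming the two-photon state (|xy> + |yx>)/sqrt(2) and ideal linear-polarization measurements; the function names are hypothetical.

```python
import numpy as np

# Two-photon state (|xy> + |yx>)/sqrt(2) in the basis |xx>, |xy>, |yx>, |yy> (assumed).
psi = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)

def linear_state(angle):
    """Single-photon state linearly polarized at `angle` from the x axis."""
    return np.array([np.cos(angle), np.sin(angle)], dtype=complex)

def basis(angle):
    """The two outcomes of an analyzer set to `angle`: along it and perpendicular to it."""
    return [linear_state(angle), linear_state(angle + np.pi / 2)]

def joint(angle_a, angle_b):
    """joint[i, j] = probability that A gets outcome i and B gets outcome j."""
    return np.array([[abs(np.vdot(np.kron(u, v), psi)) ** 2
                      for v in basis(angle_b)]
                     for u in basis(angle_a)])

def marginal_b(angle_a, angle_b):
    """What B sees alone: joint probabilities summed over A's outcomes."""
    return joint(angle_a, angle_b).sum(axis=0)

angle_b = 0.3  # B's analyzer setting, arbitrary
for angle_a in (0.0, 0.4, np.pi / 4, 1.2):
    print(angle_a, marginal_b(angle_a, angle_b))

# The joint table does depend on A's setting; only the comparison of both
# records, exchanged by ordinary slower-than-light means, reveals that.
print(joint(0.0, 0.0))
print(joint(np.pi / 4, 0.0))
```

B's marginal stays at (0.5, 0.5) for every setting of A, even though the joint probabilities change from perfectly correlated to completely uncorrelated as A rotates the analyzer.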
Another safety rule of quantum mechanics is that the probabilistic nature of quantum theory prevents the person A from commanding her own photon. Even A herself will get random results. If A were able to force her photon to be measured as R (or L, depending on A's thinking), and if the correlations were preserved, the person B would have to get the same result as A, and A could therefore send bits of information faster than light. But once again, it's not possible in reality because the specific results in A as well as B are unpredictable. Both A and B will know that the results of their measurements are correlated - if they measure the same type of property of their photons - but they will never be able to affect in advance what these results are, and therefore they won't be able to use these strange features of quantum theory for superfast communication.
You can see that the orthodox quantum mechanics is an ally of special relativity. They work together, but a modification of the quantum theory (such as the hidden variable theories) would spoil relativity, too.
Finally, let me say that a more complete theory underlying quantum field theory - in other words, string/M-theory - may predict some subtle violations of locality and causality. But their reach will probably be very short (the string scale or the Planck scale); it may become macroscopic in the presence of horizons. Nevertheless, there are also arguments that show that string/M-theory preserves locality and causality (and their major consequences) exactly, at least in some contexts and formalisms. See a recent paper by David Gross and Ted Erler, for example. The lessons from quantum field theory can therefore be extrapolated quite seriously even to a deeper theory underlying QFT.