## Get Custom homework writing help and achieve A+ grades!

Custom writing help for your homework, Academic Paper and Assignments from Academic writers all over the world at Tutorsonspot round the clock.


- Custom homework writing help
- Plagiarism Free Solutions Guaranteed!
- A+ Grade Guaranteed!
- Privacy guaranteed!
- Best prices guaranteed!
- Timely delivery guaranteed!
- Hundreds of Qualified Writers 24/7

Brief written statements containing the main conceptual ideas from the assigned reading material, in your own words, accompanied by three written questions you would like to have answered in class.

Chapter 3: Operators and Observables, Sections 3.3.1, 3.3.2, and 3.3.3. Treat these sections together, then write three questions.

Angular momentum operator and statistical interpretation, Sections 3.3.4 and 3.3.5. Treat these sections together, then write three questions.

Chapter 4: Sections 4.1.1 and 4.1.2. Treat these sections together, then write three questions.

| Field | Value |
| --- | --- |
| Project ID | 717182 |
| Category | Business & Management |
| Subject | Strategic Management |
| Level | O-Level/A-Level |
| Deadline | 7 Days |
| Budget | Above $90 (over 10 pages / long assignment), approx. |
| Required Skills | PowerPoint Presentation (PPT) |
| Type | Open for Bidding |

We have over 1500 academic writers ready and waiting to help you achieve academic success

All your information is private and confidential; it is not shared with any other party. No one will know that you received help with your academic paper from us.

Hi, hope you are doing well. I can do this easily because I have experience writing articles for different websites, creative content for several blogs, and SEO copy. I have also written many Kindle ebooks. Being a creative writer, I think I am the most eligible person for your ghostwriting project. So let's not delay any longer and start chatting immediately.

Offer: $100

Australia

Hi, I am an MS Research Scholar, and after carefully reading the description of the project I can confidently say that I am a suitable candidate, equipped with right skills, to complete this valuable task of yours. I assure you timely completion, originality and grammatically correct content, according to your needs. Please feel free to contact me for completion of this task. Thank you very much.

Offer: $100

Pakistan

I feel, I am the best option for you to fulfill this project with 100% perfection. I am working in this industry since 2014 and I have served more than 1200 clients with a full amount of satisfaction.

Offer: $95

Australia

Hi! It is good to see your project and being a reputed & highest rated freelance writer on this website, you can be assured of quality work! I am here to provide you with completely non-plagiarised work

Offer: $95

India

Greetings! I am a professional electrical and telecom engineer with rich experience in QPSK, OFDM, FFT, and similar signal processing concepts in MATLAB. I can definitely satisfy you. More in chat. Thanks.

Offer: $95

India

This project is my strength and I can fulfill your requirements properly within your given deadline. I always give plagiarism-free work to my clients at very competitive prices.

Offer: $75

United States of America

Hello dear, I have gone through your prompt, and I can comfortably handle it within the given time. I am competent, well informed and dedicated writer. I follow instructions, communicate effectively and revise the work given ASAP as asked. Hire me and you will get a QUALITY paper. Thank you

Offer: $100

India

Hi there. I am a proficient writer with an impeccable understanding of the English language and scientific research formatting, including APA, MLA, Harvard, Chicago, and IEEE. I have handled over 1000 research proposals in different disciplines. I guarantee an A+ grade given a chance to work on your project. Do hire me. Regards

Offer: $100

United States of America

Hello there, I saw you have posted a project. I am interested in your project. I am a fiction writer, travel blogger, and content creator. I have been doing creative writing for 5 years and I would love to put my skills to work for you. I am passionate about the writing process and have penned a wide variety of content. From fictional stories to personal accounts, to travel articles...these just scratch the surface of what I love to do.

Offer: $100

Pakistan

I can help you with creating a presentation of one slide for The Word of William Hunter. I will be happy to offer you 100% original work with high-quality standard, professional research and writing services of various complexities.

Offer: $65

United States of America

Greetings! I’m very much interested to write for attendance systems. I am a Professional Writer with over 5 years of experience, therefore, I can easily do this job. I will also provide you with TURNITIN PLAGIARISM REPORT. You can message me to discuss the details.

Offer: $80

India

I am a Ph.D. writer with more than 9 years of working experience in Writing. I have successfully completed more than 4500 projects for my clients with their full amount of satisfaction. I will provide you super quality work according to your given requirements and deadline with ZERO plagiarism.

Offer: $80

Lev I. Deych

Advanced Undergraduate Quantum Mechanics Methods and Applications


Lev I. Deych Physics Department Queens College and the Graduate Center City University of New York New York, NY, USA

ISBN 978-3-319-71549-0 ISBN 978-3-319-71550-6 (eBook) https://doi.org/10.1007/978-3-319-71550-6

Library of Congress Control Number: 2017961762

© Springer International Publishing AG, part of Springer Nature 2018 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer International Publishing AG part of Springer Nature. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To Yelena, Daniil, Sergei, and my dog Xena

Preface

The market for undergraduate textbooks on quantum mechanics is oversaturated as one can literally choose from tens, if not hundreds, of different titles, with several of them being accepted as a standard. Writing yet another textbook on this subject might appear as a reckless and time-consuming adventure, and whoever got engaged in it can be rightfully suspected of arrogance and self-aggrandizement. Be that as it may, after 15 years of teaching an undergraduate quantum mechanics course at Queens College of the City University of New York, I finally came to the realization that neither I nor my students are particularly happy with the existing standards. It also occurred to me that my approaches to teaching various topics in quantum mechanics, along with the lecture notes I have accumulated over these years, could form the foundation of a textbook that would be different from those I saw on the market. In many cases, students treat their physics textbooks as a reference source for formulas and postulates, used to solve problems, rather than as actual reading material. I, on the other hand, dreamed about writing a book that would actually be read; and in order to achieve that, I have devoted a significant amount of time to the explanation of the most minute details of various technical derivations. The level of technical detail in the book would ideally allow students to use it without the need for a lecturer’s explanations, allowing professors to use precious lecture time on something more productive and fun. But quantum mechanics is not just about derivations of formulas (though these might be fun, of course, for the mathematically inclined). The physical content of the derived results is immensely more important and interesting, if you ask me. Thus, I have included in the text extensive qualitative discussions of the physical significance of derived formulas and equations.
Finally, to make reading even more enjoyable, I tried to preserve the informal, colloquial style of my lectures, addressing readers directly and avoiding the dry, impersonal manner found in too many formal scientific texts.

Most physics textbooks present ideas and concepts without more than a passing mention of the people who discovered them. Such an approach aims to emphasize the objectivity of the laws and principles that physics deals with. The essence of these laws does not depend on the personal traits and ideologies of their discoverers, which is not always the case in the humanities. While I agree that such a cold, formal approach is justified by the objective nature of the laws of physics, I am still not sure that it is the best way to present the material to students. This approach dehumanizes physics, preventing people from relating to it on a personal level and seeing physics as part of the general human experience. In this book, I tried to break out of this tradition and introduced some personal details about the lives of those scientists who were responsible for developing quantum theory and changing our views of the universe along the way. I would like the readers to see that the complex technical and philosophical ideas in quantum mechanics were generated by mortal human beings with strengths and weaknesses, and that they experienced struggles and made mistakes for which they had to bear full responsibility. Obviously, it would be impossible to talk about all these great (and sometimes not so great) men¹ in detail. But whenever possible, I have tried to provide bits and pieces about the personal lives of scientists whose names appear in the text.

Anyone writing a textbook on such an immense subject as quantum mechanics always struggles with the question of which topics to include and which to leave out. There are, of course, some standard concepts that cannot be avoided, but beyond those, the choice is always a function of the author’s personal predilections. These predilections led me to include some topics that are not usually covered in undergraduate textbooks, such as the Heisenberg equations, the transfer-matrix method for one-dimensional problems, optical transitions in semiconductors, Landau levels, and the Hall conductivity. At the same time, I left out such popular topics as WKB and variational methods. The total contents of the book were chosen to satisfy the needs of a two-semester course for students who have already been exposed to some quantum ideas in a modern physics course or its equivalent. At the same time, the book can also be used for a one-semester course. While each instructor deciding to adopt this book can choose whatever material they prefer for their one-semester course, my suggestion would be to include Chaps. 1 through 9, as well as Chaps. 11 and 15, which can be taught almost independently of other chapters of the book. (While there are some cross-references between these and other chapters, they shouldn’t impede the student’s or instructor’s ability to get through the arguments.) Finally, I would like to emphasize that the problems offered for solution by the students are an integral part of the text and must be treated as such.

Quantum mechanics is one of the ultimate triumphs of the human mind. I enjoyed writing this book, trying to convey the awesomeness of quantum ideas and of the people who contributed to their development, and I hope that you will enjoy reading it. While physics students, unlike their brethren from humanities departments, are not accustomed to actually reading their assigned texts, I would like to encourage you to overcome the established habits and give this book a chance. Who knows, you might like it.

¹ Needless to say, at the time of the birth of quantum physics, most of the scientists were, indeed, men. Even more remarkable are the achievements of a few women who left their marks on twentieth-century physics. Nobel Prize winners Marie and Irene Curie have since become household names, but other female scientists such as German mathematician Emmy Noether and Austrian-Swedish experimental physicist Lise Meitner also deserve to be remembered. Unfortunately, the contributions of Noether, who uncovered the intimate relationship between symmetries and conservation laws, and Meitner, who together with Otto Hahn discovered the fission of uranium, are too advanced for an introductory quantum mechanics book, preventing me from talking about these scientists in more detail.

New York, NY, USA Lev I. Deych

Acknowledgments

I would like to thank Drs. Mishaktul Bhattacharya and Alexey Tsvelik for reading and commenting on the portions of the manuscript and for general encouragement and Prof. Nathaniel Knight for making sure that my historical introduction did not deviate too much from actual historical facts. My special gratitude, however, goes to Prof. Lev Murokh of Queens College for test-driving the book in a quantum mechanics class he was teaching in fall 2017 and for discovering multiple misprints and quite a few errors, thereby saving me from a great deal of embarrassment. However, I take full responsibility for whatever mistakes might still remain in the text and will be immensely grateful to anyone pointing them out to me.


Contents

Part I Language and Formalism of Quantum Mechanics

- 1 Introduction
  - 1.1 The Rise of Quantum Physics and Its Many Oddities
  - 1.2 Brief Overview of the Historical Background
- 2 Quantum States
  - 2.1 Classical and Quantum States
  - 2.2 Quantum States and Hilbert Vector Space
    - 2.2.1 Superposition Principle
    - 2.2.2 Linear Vector Spaces
    - 2.2.3 Superposition Principle and Probabilities
  - 2.3 States Characterized by Observables with Continuous Spectrum
  - 2.4 Problems
- 3 Observables and Operators
  - 3.1 Hamiltonian Formulation of Classical Mechanics
  - 3.2 Operators in Quantum Mechanics
    - 3.2.1 General Definitions
    - 3.2.2 Commutators, Functions of Operators, and Operator Identities
    - 3.2.3 Eigenvalues and Eigenvectors
  - 3.3 Operators and Observables
    - 3.3.1 Hermitian Operators
    - 3.3.2 Quantization Postulate
    - 3.3.3 Constructing the Observables: A Few Important Examples
    - 3.3.4 Eigenvalues of the Angular Momentum
    - 3.3.5 Statistical Interpretation
  - 3.4 Problems
- 4 Unitary Operators and Quantum Dynamics
  - 4.1 Schrödinger Picture
    - 4.1.1 Time-Evolution Operator and Schrödinger Equation
    - 4.1.2 Stationary States
    - 4.1.3 Ehrenfest Theorem and Correspondence Principle
  - 4.2 Heisenberg Picture
  - 4.3 Problems
- 5 Representations of Vectors and Operators
  - 5.1 Representation in Continuous Basis
    - 5.1.1 Position and Momentum Operators in a Continuous Basis
    - 5.1.2 Parity Operator
    - 5.1.3 Schrödinger Equation in the Position Representation
    - 5.1.4 Orbital Angular Momentum in Position Representation
      - 5.1.4.1 Operators
      - 5.1.4.2 Eigenvectors
  - 5.2 Representations in Discrete Basis
    - 5.2.1 Discrete Representation from a Continuous One
    - 5.2.2 Transition from One Discrete Basis to Another
    - 5.2.3 Spin Operators
  - 5.3 Problems

Part II Quantum Models

- 6 One-Dimensional Models
  - 6.1 Free Particle and the Wave Packets
  - 6.2 Rectangular Potential Wells and Barriers
    - 6.2.1 Potential Wells: Systems with Mixed Spectrum
    - 6.2.2 Square Potential Barrier
  - 6.3 Delta-Functional Potential
  - 6.4 Problems
- 7 Harmonic Oscillator Models
  - 7.1 One-Dimensional Harmonic Oscillator
    - 7.1.1 Stationary States (Eigenvalues and Eigenvectors)
    - 7.1.2 Dynamics of Quantum Harmonic Oscillator
  - 7.2 Isotropic Three-Dimensional Harmonic Oscillator
    - 7.2.1 Isotropic Oscillator in Spherical Coordinates
  - 7.3 Quantization of Electromagnetic Field and Harmonic Oscillators
    - 7.3.1 Electromagnetic Field as a Harmonic Oscillator
    - 7.3.2 Coherent States of the Electromagnetic Field
  - 7.4 Problems
- 8 Hydrogen Atom
  - 8.1 Transition to a One-Body Problem
  - 8.2 Eigenvalues and Eigenvectors
  - 8.3 Virial and Feynman–Hellmann Theorems and Expectation Values of the Radial Coordinate in a Hydrogen Atom
  - 8.4 Problems
- 9 Spin 1/2
  - 9.1 Introduction: Why Spin?
  - 9.2 Spin 1/2 Operators and Spinors
  - 9.3 Dynamic of Spin in a Uniform Magnetic Field
    - 9.3.1 Schrödinger Picture
    - 9.3.2 Heisenberg Picture
  - 9.4 Spin of a Two-Electron System
    - 9.4.1 Space of Two-Particle States
    - 9.4.2 Operator of the Total Spin
  - 9.5 Operator of Total Angular Momentum
    - 9.5.1 Combining Orbital and Spin Degrees of Freedom
    - 9.5.2 Total Angular Momentum: Eigenvalues and Eigenvectors
  - 9.6 Problems
- 10 Two-Level System in a Periodic External Field
  - 10.1 Two-Level System with a Time-Independent Interaction: Avoided Level Crossing
  - 10.2 Two-Level System in a Harmonic Electric Field: Rabi Oscillations
  - 10.3 Problems
- 11 Non-interacting Many-Particle Systems
  - 11.1 Identical Particles in the Quantum World: Bosons and Fermions
  - 11.2 Constructing a Basis in a Many-Fermion Space
  - 11.3 Pauli Principle and Periodic Table of Elements: Electronic Structure of Atoms
  - 11.4 Exchange Energy and Other Exchange Effects
    - 11.4.1 Exchange Interaction
    - 11.4.2 Exchange Correlations
  - 11.5 Fermi Energy
  - 11.6 Problems

Part III Quantum Phenomena and Methods

- 12 Resonant Tunneling
  - 12.1 Transfer-Matrix Approach in One-Dimensional Quantum Mechanics
    - 12.1.1 Transfer Matrix: General Formulation
    - 12.1.2 Application of Transfer-Matrix Formalism to Generic Scattering and Bound State Problems
      - 12.1.2.1 Generic Scattering Problem via the Transfer Matrix
      - 12.1.2.2 Finding Bound States with the Transfer Matrix
    - 12.1.3 Application of the Transfer Matrix to a Symmetrical Potential Well
      - 12.1.3.1 Scattering States
      - 12.1.3.2 Bound States
  - 12.2 Resonant Tunneling
  - 12.3 Problems
- 13 Perturbation Theory for Stationary States: Stark Effect and Polarizability of Atoms
  - 13.1 Non-degenerate Perturbation Theory
    - 13.1.1 Quadratic Stark Effect
    - 13.1.2 Atom’s Polarizability
  - 13.2 Degenerate Perturbation Theory and Applications
    - 13.2.1 General Formulation
    - 13.2.2 Linear Stark Effect
  - 13.3 Problems
- 14 Fine Structure of the Hydrogen Spectra and Zeeman Effect
  - 14.1 Spin–Orbit Interaction and Fine Structure of the Energy Spectrum of Hydrogen
    - 14.1.1 Spin–Orbit Contribution to the Hamiltonian
    - 14.1.2 Schrödinger Equation with Spin–Orbit Term
    - 14.1.3 Fine Structure of the Hydrogen Spectrum
  - 14.2 Zeeman Effect
    - 14.2.1 Zeeman Effect in the Weak Magnetic Field
    - 14.2.2 Strong Magnetic Field
    - 14.2.3 Intermediate Magnetic Field
  - 14.3 Problems
- 15 Emission and Absorption of Light
  - 15.1 Time-Dependent Perturbation Theory
  - 15.2 Fermi’s Golden Rule
    - 15.2.1 First-Order Transmission Probability in the Case of a Monochromatic Perturbation
    - 15.2.2 A Non-monochromatic Noncoherent Perturbation
    - 15.2.3 Transitions to a Continuum Spectrum
  - 15.3 Semiclassical Theory of Absorption and Emission of Light
    - 15.3.1 Interaction of Charged Particles with Electromagnetic Radiation
    - 15.3.2 Absorption and Emission of Incoherent Radiation by Atoms
      - 15.3.2.1 Absorption and Stimulated Emission
      - 15.3.2.2 Spontaneous Emission Without Quantum Electrodynamics: Einstein Meets Planck
    - 15.3.3 Optical Transitions in Semiconductors
    - 15.3.4 Selection Rules: Dipole Matrix Elements Made Easier
  - 15.4 Problems
- 16 Free Electrons in Uniform Magnetic Field: Landau Levels and Quantum Hall Effect
  - 16.1 Classical Mechanics of a Charged Particle in Crossed Uniform Electric and Magnetic Fields
    - 16.1.1 Cyclotron Motion in the Uniform Magnetic Field
    - 16.1.2 Classical Motion in Crossed Electric and Magnetic Fields
  - 16.2 Quantum Theory of Electron’s Motion in a Uniform Magnetic Field
    - 16.2.1 Landau Quantization
    - 16.2.2 Degeneracy of the Landau Levels and the Density of States
    - 16.2.3 Fermi Energy of a Gas of Non-interacting Electrons in Magnetic Field
  - 16.3 Quantum Hall Effect
  - 16.4 Problems

Index

Part I Language and Formalism of Quantum Mechanics

Chapter 1 Introduction

1.1 The Rise of Quantum Physics and Its Many Oddities

In 1900, Scots-Irish physicist Lord Kelvin (born William Thomson, knighted in 1866 by Queen Victoria for his work on the transatlantic telegraph, ennobled in 1892 for his achievements in thermodynamics, and the first British scientist elevated to the House of Lords) gave his famous speech identifying only two “clouds” in the clear sky of classical physics. One of them was the problem of the luminiferous ether, which went undetected in the series of experiments carried out between 1881 and 1887 by Americans Albert Michelson and Edward Morley, and the second one was the problem of black-body radiation. Classical physics predicted that the amount of electromagnetic energy emitted by warm bodies increases without bound as the wavelength of radiation decreases, making the total emitted energy infinite. This problem was dubbed the ultraviolet catastrophe by Paul Ehrenfest in 1911. As it turned out, both these little clouds spelled the end of classical physics: the first of them resulted in the special theory of relativity, and the solution of the second one, achieved by German physicist Max Planck, laid the first stone in the foundation of quantum physics.

To explain the radiation of black bodies, Planck had to introduce unusual entities: oscillators whose energy cannot change continuously but must be a multiple of an elementary energy quantum $h\nu$, where $h = 6.55 \times 10^{-34}$ J s is a fundamental constant of nature introduced by Planck and $\nu$ is the classical frequency of the oscillator. (This expression is often written as $\hbar\omega$, where $\hbar$ (h-bar) is the “reduced” Planck’s constant and $\omega$ is the angular frequency of the oscillator. Given that $\omega = 2\pi\nu$, you can easily find that $\hbar = h/2\pi$.) Using this assumption and the apparatus of classical statistical physics, Planck derived his famous formula for the spectral energy density of black-body radiation:

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_1


$$u(\nu, T) = \frac{8\pi h \nu^{3}}{c^{3}} \, \frac{1}{\exp\left(\dfrac{h\nu}{k_B T}\right) - 1}, \tag{1.1}$$

where $u(\nu, T)\,d\nu$ is the energy density of electromagnetic waves with frequencies within the interval $[\nu, \nu + d\nu]$, $k_B$ is the Boltzmann constant, and $T$ is the absolute temperature.¹ The original value of the constant $h$, which Planck derived by fitting his formula to the experimental data, turned out to be quite close to the modern value, which is currently believed to be

$$h = 6.62607004 \times 10^{-34}\ \mathrm{J\,s}.$$
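To get a feel for Eq. (1.1), it may help to evaluate it numerically. The short Python sketch below compares Planck's formula with the classical prediction that diverges at short wavelengths. The classical expression used here, $8\pi\nu^2 k_B T/c^3$ (usually called the Rayleigh-Jeans law), is not written out in the text and is brought in as an assumption, as are the function names, the numerical value of the Boltzmann constant, and the sample temperature.

```python
import math

H = 6.62607004e-34    # Planck constant, J s (the value quoted in the text)
K_B = 1.380649e-23    # Boltzmann constant, J/K (assumed value, not from the text)
C = 2.99792458e8      # speed of light, m/s


def planck_u(nu, T):
    """Spectral energy density of Eq. (1.1) at frequency nu (Hz), temperature T (K)."""
    x = H * nu / (K_B * T)
    if x > 700.0:
        # exp(x) would overflow a float; the density is negligibly small here.
        return 0.0
    return (8.0 * math.pi * H * nu**3 / C**3) / math.expm1(x)


def rayleigh_jeans_u(nu, T):
    """Classical density (assumed Rayleigh-Jeans form); grows as nu**2 without bound."""
    return 8.0 * math.pi * nu**2 * K_B * T / C**3


if __name__ == "__main__":
    T = 5000.0  # illustrative temperature in kelvin (hypothetical choice)
    for nu in (1e11, 1e13, 1e15, 1e16):
        print(f"nu = {nu:.0e} Hz   Planck = {planck_u(nu, T):.3e}   "
              f"classical = {rayleigh_jeans_u(nu, T):.3e}")
```

At low frequencies the two expressions nearly coincide, while at high frequencies the exponential in Eq. (1.1) suppresses the energy density. This suppression is what keeps the total emitted energy finite.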

The idea that the energy of a classical particle cannot take an arbitrary, continuously changing value seemed so revolutionary to Planck that he refused to believe that his quantized oscillators were related to any real physical objects, such as atoms or molecules, and thought of them as a purely mathematical trick that happened to work. It took Einstein’s theory of the photoelectric effect (1905), in which Einstein explicitly postulated that electromagnetic energy propagates in the form of quantized and indivisible portions, his 1907 theory of the specific heat of solids, and other developments for Planck to finally declare in 1911 that the hypothesis of quanta reflects physical reality and marks the beginning of a new era in physics.

From this point on, quantum ideas came rolling down on unsuspecting physicists like an avalanche, burying underneath it the entire classical world of objective reality and certainty. Or so it seemed. Einstein opened the can of worms by proposing the notion of light quanta, which introduced the idea of wave–particle duality into the consciousness of physicists. In 1923 this idea was picked up by a young French Ph.D. student, Louis de Broglie, who suggested in his dissertation that not only can light, normally thought to be a wave, behave as a stream of particles, but regular particles (electrons, neutrons, atoms, etc.) can also manifest wavelike properties. He postulated that Einstein’s relations connecting the wave characteristics of light (frequency $\nu$ and wavelength $\lambda$) with its particle characteristics (energy $E$ and momentum $p$)

$$E = h\nu \tag{1.2}$$

$$p = h/\lambda \tag{1.3}$$

¹ The actual story of this formula and Planck’s contribution to quantum physics is not quite that simple. First, the ultraviolet catastrophe apparently did not motivate Planck at all, as he did not think that it was an unavoidable logical consequence of classical physics. Second, Planck first obtained his formula empirically by trying to fit the experimental data and only after that found a theoretical “explanation” for it. Third, he did not believe that his quantized oscillators represent real atoms or molecules for quite some time and accepted the reality of energy quantization only very reluctantly. This story is described in more detail in the article by H. Kragh, “Max Planck: the reluctant revolutionary,” published in December 2000 in Physics World.


can be reversed and applied to electrons, protons, and other material particles. A significant difference between light and electrons, of course, is that the latter have a nonzero mass and obey a quadratic relation between energy and momentum, $E = p^2/2m_e$, where $m_e$ is the mass of the electron, while the former is characterized by the linear relation $E = pc$, where $c$ is the speed of light, which follows from relativity theory for particles with zero rest mass. This difference, as you will see later, plays a crucial role in quantum theory.
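As a rough numerical illustration of Eqs. (1.2) and (1.3) combined with the two energy-momentum relations above, the following Python sketch computes the wavelength of an electron and of a photon carrying the same energy. The function names, the 100 eV sample energy, and the numerical values of the electron mass and the electron-volt are illustrative assumptions not taken from the text, and the electron formula is valid only for non-relativistic energies.

```python
import math

H = 6.62607004e-34       # Planck constant, J s (the value quoted in the text)
M_E = 9.10938356e-31     # electron mass, kg (assumed value)
C = 2.99792458e8         # speed of light, m/s
EV = 1.6021766208e-19    # joules per electron-volt (assumed value)


def electron_wavelength(kinetic_energy_ev):
    """Wavelength of a non-relativistic electron: E = p^2 / (2 m_e), lambda = h / p."""
    energy = kinetic_energy_ev * EV
    momentum = math.sqrt(2.0 * M_E * energy)
    return H / momentum


def photon_wavelength(energy_ev):
    """Wavelength of a photon: E = p c, so lambda = h / p = h c / E."""
    momentum = energy_ev * EV / C
    return H / momentum


if __name__ == "__main__":
    energy_ev = 100.0  # hypothetical sample energy
    print(f"electron: {electron_wavelength(energy_ev):.3e} m")
    print(f"photon:   {photon_wavelength(energy_ev):.3e} m")
```

For 100 eV the electron wavelength comes out at roughly $1.2 \times 10^{-10}$ m, comparable to the spacing between atoms in a crystal, while a photon of the same energy has a wavelength about a hundred times longer. The different dependence of momentum on energy is responsible for this gap.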

The revolutionary matter wave idea of de Broglie was confirmed experimentally just 4 years later, in 1927, by American physicists Clinton Davisson and Lester Germer working at Bell Labs and independently by British scientist George Paget Thomson at the University of Aberdeen. Davisson and Germer observed diffraction of electrons scattered by crystalline nickel, while Thomson studied electrons propagating through a metal film.² These achievements resulted in the 1929 Nobel Prize for de Broglie and the shared 1937 Nobel Prize for Davisson and Thomson. So, if you thought that the quantization of energy was a revolutionary idea, then wave–particle dualism must really blow your mind: how can something be simultaneously a particle (a localized, indivisible, countable entity) and a wave (an extended, continuous, arbitrarily divisible excitation of a medium)? To wrap his mind around this oddity of the quantum world, Danish physicist Niels Bohr came up with his famed complementarity principle, which essentially states that all experiments that one can conduct with objects of atomic scale can be divided into two never-overlapping groups. In the experiments belonging to the first group, material objects reveal their particle-like side; in the experiments of the second group, they present to the world their wavelike quality; and it is impossible to design an experiment in which both sides are manifested together. In the book The Evolution of Physics by A. Einstein and L. Infeld, this principle was presented in the following way:

But what is light really? Is it a wave or a shower of photons? There seems no likelihood for forming a consistent description of the phenomena of light by a choice of only one of the two languages. It seems as though we must use sometimes the one theory and sometimes the other, while at times we may use either. We are faced with a new kind of difficulty. We have two contradictory pictures of reality; separately neither of them fully explains the phenomena of light, but together they do.

Even though this quote refers to the properties of light, it can be repeated almost verbatim for the quantum description of any atomic object. Bohr’s own formulation of the complementarity as presented in his famous 1927 lecture at a conference in the Italian town of Como is somewhat deeper even if more vague when taken out of the broader context:

any given application of classical concepts precludes the simultaneous use of other classical concepts which in a different connection are equally necessary for the elucidation of the phenomena.

² Here is a historical irony for you: George Thomson, who got the Nobel Prize for proving that electrons are waves, was the son of J.J. Thomson (not to be confused with W. Thomson, Lord Kelvin), a prominent English physicist who got the Nobel Prize for proving that the electron is a particle.


What Bohr is implying here is that by virtue of the macroscopic size of an observer (that would be us, humans), any measurement is necessarily conducted by a large macroscopic apparatus, which is supposed to be fully describable by the laws of classical physics. The results of the measurements, therefore, are interpreted in terms of classical concepts such as momentum, position, energy, time, etc. The complementarity principle then posits that one cannot design a single experiment in which classical quantities belonging to complementary classes, such as momentum and position or energy and time, can be determined simultaneously. Thus measuring the position of a quantum object, i.e., trying to localize it at a point in space and time, reveals this object’s particle-like characteristics while simultaneously destroying any traces of its wavelike behavior. The most frequently discussed illustration of this idea is the double-slit interference experiment, beautifully analyzed, for instance, in the famous The Feynman Lectures on Physics, which are now freely available online at http://www.feynmanlectures.caltech.edu/, and I encourage you to go ahead and peruse them. The scale disparity between observers and atomic objects is what makes quantum theory so difficult to “understand”: our vocabulary, developed to reflect the macroscopic world of our own scale, fails when we try to apply it to the world of atoms and electrons. This is how Richard Feynman describes it:

Because atomic behavior is so unlike ordinary experience, it is very difficult to get used to, and it appears peculiar and mysterious to everyone—both to the novice and to the experienced physicist. Even the experts do not understand it the way they would like to, and it is perfectly reasonable that they should not, because all of direct, human experience and of human intuition applies to large objects. We know how large objects will act, but things on a small scale just do not act that way. So we have to learn about them in a sort of abstract or imaginative fashion and not by connection with our direct experience.3

In 1927 German physicist Werner Heisenberg found himself locked in a battle of ideas with Austrian physicist Erwin Schrödinger, the author of the famed Schrödinger equation, who took de Broglie’s matter wave idea to heart and introduced a wave function satisfying a simple differential equation, which he tried to interpret quite literally as a quantity representing an actual electron smeared over some small but finite region of space. It differed strongly from Heisenberg’s approach, which was based upon the idea of “quantum jumps” introduced in Bohr’s model of the hydrogen atom (I am sure you have heard about Bohr’s quantization postulates and his planetary model of the atom, so I do not have to reproduce them here). Heisenberg described these jumps using abstract algebraic quantities, later found by his collaborators and compatriots Ernst Pascual Jordan and Max Born to be matrices. Schrödinger’s interpretation of the matter waves was very appealing, especially to older physicists, because it preserved most of the classical world view: continuous evolution of the electron’s charge density in space–time with no incomprehensible discontinuities introduced by quantum jumps. You might be amused by the following exchange between Schrödinger and Bohr that took place during Schrödinger’s visit to Copenhagen in October of 1926:⁴

³ R. Feynman, R. Leighton, M. Sands, Feynman Lectures on Physics, vol. 1, Ch. 37, online edition available at http://www.feynmanlectures.caltech.edu/.

⁴ Quoted from the book by W. Moore, Schrödinger: Life and Thought (Cambridge University Press, Cambridge, 1989).


Schrödinger: “You surely must understand, Bohr, that the whole idea of quantum jumps necessarily leads to nonsense. . . . the electron jumps from this orbit to another one and thereby radiates. Does this transition occur gradually or suddenly? If it occurs gradually, then the electron must gradually change its rotation frequency and energy. It is not comprehensible how this can give sharp frequencies for spectral lines. If the transition occurs suddenly, in a jump so to speak, . . . one must ask how the electron moves in a jump. Why doesn’t it emit a continuous spectrum? And what laws determine its motion in this jump?”

Bohr: “Yes, in what you say you are completely right. But that doesn’t prove that there are no quantum jumps. It only proves that we can’t visualize them, that means that the pictorial concepts we use to describe the events of everyday life and the experiments of the old physics do not suffice also to represent the process of a quantum jump. That is not surprising when one considers that the processes with which we are concerned here cannot be the subject of direct experience . . . and our concepts do not apply to them”.

Schrödinger: “I do not want to get into a philosophical discussion with you about the formation of concepts . . . but I should simply like to know what happens in an atom. It’s all the same to me in what language you talk about it. If there are electrons in atoms, which are particles, as we have so far supposed, they must also move about in some way. At the moment, it’s not important to me to describe this motion exactly; but it must at least be possible to bring out how they behave in a stationary state or in a transition from one state to another. But one sees from the mathematical formalism of wave or quantum mechanics that it gives no rational answer to these questions. As soon, however, as we are ready to change the picture, so as to say that there are no electrons as particles but rather electron waves or matter waves, everything looks different. We no longer wonder about the sharp frequencies. The radiation of light becomes as easy to understand as the emission of radio waves by an antenna. . . ”

Bohr: “No, unfortunately, that is not true. The contradictions do not disappear, they are simply shifted to another place. . . Think of the Planck radiation law. For the derivation of this law, it is essential that the energy of the atom have discrete values and change discontinuously . . . You can’t seriously wish to question the entire foundation of quantum theory.”

Schrödinger: “Naturally, I do not maintain that all these relations are already completely understood . . . but I think that the application of thermodynamics to the theory of matter waves may eventually lead to a good explanation of Planck’s formula”

Bohr: “No, one cannot hope for that. For we have known for 25 years what the Planck formula means. And also we see the discontinuities, the jumps, quite directly in atomic phenomena, perhaps on the scintillation screen or in a cloud chamber . . . You can’t simply wave away these discontinuous phenomena as though they didn’t exist.”

Schrödinger: “If we are still going to have to put up with these damn quantum jumps, I am sorry that I ever had anything to do with quantum theory.”

Bohr: “But the rest of us are very thankful for it—that you have—and your wave mechanics in its mathematical clarity and simplicity is a gigantic progress over the previous form of quantum mechanics.”

By the end of this discussion, Schrödinger fell ill and for a few days had to stay as a guest in Bohr’s house, where Bohr’s wife, Margrethe, was taking care of him. In the same year Schrödinger wrote:⁵

My theory was inspired by L de Broglie . . . and by short but incomplete remarks by A Einstein. . . . No genetic relation whatever with Heisenberg is known to me. I knew of his theory, of course, but felt discouraged not to say repelled, by the methods of transcendental algebra, which appeared very difficult to me and by the lack of visualizability.

⁵ E. Schrödinger, On the relationship between the Heisenberg-Born-Jordan quantum mechanics and mine. Ann. Phys. 79, 734 (1926).


Heisenberg fully understood the weakness of the matrix theory, writing in his 1925 paper, coauthored with Born and Jordan:⁶

Admittedly, such a system of quantum-theoretical relations between observable quantities . . . would labor under the disadvantage that there can be no directly intuitive geometrical interpretation because the motion of electrons cannot be described in terms of the familiar concepts of space and time.

The popularity of Schrödinger’s matter waves not only spelled purely scientific trouble for Heisenberg, as a competing theory, but also jeopardized his career: he was about to start searching for a permanent university position, and the fondness of the professors in charge of academic appointments for Schrödinger’s views created problems for him. Thus, in 1927, Heisenberg undertook concentrated efforts to remedy the perceived weaknesses of his approach to quantum theory, which resulted in the celebrated Uncertainty Paper,⁷ where his now famous uncertainty principle

$$\Delta x \, \Delta p \geq \frac{\hbar}{2} \tag{1.4}$$

was shown to be an unavoidable consequence of the inner structure of his quantum formalism ($\Delta x$ and $\Delta p$ are loosely defined “uncertainties” of the position and momentum). An equally important portion of the paper was devoted to expounding the connections between the formalism and the real world. Heisenberg made a point of emphasizing that in the end the goal of the theory is to predict the values of those quantities that can be observed by a clearly defined physical process. Considering the operational and physical details of an observation of the coordinate, Heisenberg showed that it is impossible to carry out such an observation without disturbing the velocity of the electron, and therefore, such concepts as a trajectory or a continuous time dependence of a particle’s coordinate have no place in the theory of the electron’s properties. He illustrated these ideas with a thought experiment involving observation of an electron using a gamma-ray microscope. The idea of the experiment is that in order to actually observe (“see”) the position of the electron, one has to shine light on it and detect the reflected (or scattered, if you prefer) rays using a microscope. The accuracy in measuring the position is limited by the microscope’s resolution (the electron can be anywhere within a region resolved by the microscope), which is proportional to the wavelength of light: the shorter the wavelength, the smaller the region of space it is able to resolve. Thus to determine the precise position of the electron, one must use light of a very short wavelength (hence the gamma-ray microscope in Heisenberg’s paper). The problem is, however, that light of very short wavelength has very large momentum (see Eq. 1.3), which it transfers to the electron, changing its momentum by an amount that is inversely proportional to the wavelength. The precise value of the electron’s new momentum and its direction

⁶ M. Born, W. Heisenberg, P. Jordan, Zeitschr. Phys. 35, 557 (1926).

⁷ W. Heisenberg, The physical content of quantum kinematics and mechanics. Zeitschr. Phys. 43, 172 (1927).


are completely unknown, and, therefore, in trying to improve our knowledge of the electron’s position, we destroy our ability to know its momentum.
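The microscope argument can be turned into a back-of-the-envelope estimate. In the Python sketch below, the position uncertainty is taken to be of the order of the wavelength and the momentum kick of the order of the photon momentum from Eq. (1.3). These order-of-magnitude identifications, the function names, the electron mass value, and the sample lengths are simplifying assumptions made for illustration; they are not Heisenberg's actual derivation.

```python
import math

H = 6.62607004e-34          # Planck constant, J s (the value quoted in the text)
HBAR = H / (2.0 * math.pi)  # reduced Planck constant, J s
M_E = 9.10938356e-31        # electron mass, kg (assumed value)


def microscope_estimate(wavelength):
    """Crude estimate: delta_x ~ wavelength, delta_p ~ h / wavelength (Eq. 1.3)."""
    delta_x = wavelength
    delta_p = H / wavelength
    return delta_x, delta_p


def min_momentum_uncertainty(delta_x):
    """Smallest delta_p compatible with Eq. (1.4) for a given delta_x."""
    return HBAR / (2.0 * delta_x)


if __name__ == "__main__":
    for wavelength in (5e-7, 1e-10, 1e-12):  # visible light, X-ray, gamma ray
        dx, dp = microscope_estimate(wavelength)
        print(f"lambda = {wavelength:.0e} m: dx*dp = {dx * dp:.3e} J s, "
              f"bound hbar/2 = {HBAR / 2.0:.3e} J s")

    dx = 1e-10  # hypothetical localization of about one atomic size
    dp = min_momentum_uncertainty(dx)
    print(f"dx = {dx:.0e} m gives dp >= {dp:.3e} kg m/s, "
          f"i.e. dv >= {dp / M_E:.3e} m/s for an electron")
```

In this crude picture the product of the two uncertainties is of order $h$ regardless of the wavelength, so shortening the wavelength trades position uncertainty for momentum uncertainty without ever violating Eq. (1.4).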

Heisenberg’s uncertainty principle and the idea that it is the process of measurement that defines what can be known about the system contributed to Bohr’s thinking about his complementarity principle. Taken together with the uncertainty principle and Born’s interpretation of the results of quantum measurements in terms of probabilities,⁸ it eventually resulted in the complete mathematical and conceptual framework of quantum theory known as the Copenhagen interpretation. According to this interpretation, the observable quantities describing a quantum system do not have any definite value before they are actually measured and randomly take one of the allowed values only when a measurement is performed. The act of measurement abruptly changes the system, bringing it into the state corresponding to the realized value of the measured quantity.

Not everyone was happy with the Copenhagen interpretation, including one of the originators of the quantum revolution, Albert Einstein. He couldn’t reconcile himself with the loss of classical determinism, the idea that the true laws of nature must provide us with complete and precise information about all essential characteristics of material objects and that, given the right tools, we can always measure them experimentally. Einstein wrote to Max Born in 1926:⁹

Quantum mechanics is certainly imposing. But an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.

“He” in this context is a direct reference to the Creator, who appears explicitly in a catchier and better known expression of the same idea recorded in the book Einstein and the Poet by William Hermanns: “God does not play dice with the world.” Leaving aside the issue of Einstein’s religious beliefs, you need to understand why this particular feature of quantum theory bothered him so much. After all, Einstein had no objections to using probability in classical statistical physics and successfully used probabilistic concepts himself in his work on Brownian motion. The difference, of course, comes from the fact that the probabilities in classical statistical physics reflect a lack of information about the positions and velocities of individual molecules, and in the absence of this information, the language of distribution functions and probabilities is the best way to deal with the situation. In the Copenhagen interpretation of quantum mechanics, the statistical uncertainty was made inherent to the nature of the universe, and Einstein had difficulty with this idea. For many years, Einstein and Bohr participated in public discussions and exchanged letters, in which Einstein tried to find counterexamples to the Copenhagen interpretation and Bohr refuted all of them. These exchanges between Einstein and Bohr are among the finest examples of scientific debate, both oral and epistolary.

⁸ M. Born, On the quantum mechanics of collisions. Zeitschr. Phys. 37, 863 (1926).

⁹ Letter to Max Born (4 December 1926); The Born-Einstein Letters (translated by Irene Born) (Walker and Company, New York, 1971).


In this book, I will rely on the traditional Copenhagen interpretation of quantum theory and will present it taking the “bull by the horns”: in the spirit of Heisenberg’s view that only those quantities that can actually be observed in a particular set of experiments should be used to describe quantum systems, I will start with the abstract concept of the quantum state defined by a set of such observable quantities. Then I will place this concept within the mathematical structure of a linear vector space and will proceed from there. The Schrödinger wave function will appear much later in the text (as compared to most other undergraduate-level textbooks) as one of many alternative and equivalent ways of describing the states of quantum systems.

However, before plunging into the depth of the quantum formalism and its application, I would like to provide you with the historical background on the times when quantum theory was born. While it will not help you understand the physics better, it will help you see that advances in physics were not an isolated incident but a part of a general trend in the movement of humanity toward modernity. You will also be able to appreciate that people responsible for the birth of quantum theory are not just abstract great names—they are real people with different views on life, different political preferences, and moral principles, people forced to make difficult choices outside of their professional lives and bearing responsibility for their choices.

1.2 Brief Overview of the Historical Background

The history of quantum theory from early childhood to maturity covers the period between 1900 and 1930 and involves such countries as Germany, Austria, France, Italy, Denmark, the UK, and the USA. During this time, not only did the facade of classical physics crumble, but the entire world changed drastically. It was a time of great upheavals and great discoveries, unendurable misery, and unsurpassed achievements in science, architecture, arts, music, and literature. The period between 1900 and 1914 was a time of inter-European and even transcontinental integration (or what we would now call globalization), with a free flow of goods, people, and capital across borders, and a time of liberalization and democratization, when even the monarchic regimes of Germany, Austria-Hungary, and Russia provided more political rights and more freedom to their citizens. The liberalization and the growth of wealth that accompanied it were beneficial for developments in science, arts, music, and literature. Physics was no exception: free travel between countries and the free exchange of ideas created fertile ground for new developments, which included relativity theory and quantum mechanics.

At the same time, the wealth and prosperity weren’t shared by all, and socialist ideas started attracting a significant number of followers. The Austro-Hungarian Empire proved to be the least stable, being vulnerable to the rising nationalism of the smaller nations comprising it. On the surface, the everyday life of people in European countries appeared to follow the normal “business as usual” path; underneath this placidity, tensions were rising, especially in the Balkans. This was the calm before the storm, as everything came crashing down on June


28, 1914, with the assassination of the heir to the Austro-Hungarian throne, Archduke Franz Ferdinand, at the hands of a teenage Serbian nationalist. On July 28 Austria declared war on Serbia, and on August 1 and 3, Germany declared war on Russia and France, respectively, and invaded Belgium. On August 4 Britain declared war on Germany, and World War I was in full swing.

In response to the war, nationalist and militarist fever was rising even among intellectuals: scientists, artists, and poets. Such prominent German physicists as Nobel Prize winners Max Planck, Philipp Lenard (photoelectric effect), Walther Nernst (third law of thermodynamics), Wilhelm Roentgen (X-rays), and Wilhelm Wien (black-body radiation) signed the infamous Manifesto of the Ninety-Three declaring unequivocal support for the German occupation of Belgium, actions known in history as the Rape of Belgium. The war devastated Europe: 18 million died (11 million military personnel and 7 million civilians), and 23 million were wounded. This war witnessed the first widespread use of chemical weapons on the battlefield, most effectively by the German Army but also by the Austrians, French, and British. Chlorine and mustard gas became household names.

During the war, not surprisingly, there was a lull in the development of physics in general and quantum theory in particular, as many scientists on all sides participated in the war effort. In addition to direct deaths, suffering, and destruction, the war started a chain of events that led to the even greater horrors of World War II. The Austro-Hungarian Empire disintegrated, leaving in its wake a multitude of smaller countries in Central and Eastern Europe (Austria, Hungary, Yugoslavia, Czechoslovakia, Poland); the Russian Empire went through the bloody “Bolshevik” revolution, which gave birth to a cruel totalitarian regime under the guise of the “Dictatorship of the Proletariat.” Germany lost the war and was forced to sign the harsh Versailles peace treaty. By the terms of the treaty, Germany and her allies took responsibility for all the losses and damages of the war, and Germany had to pay heavy reparations and lost significant territory: Alsace-Lorraine and the output of the Saar coal mines went to France, while Upper Silesia, Eastern Pomerania, and part of East Prussia went to Poland. The treaty left Germany humiliated and fostered feelings of resentment and a desire for revenge among large segments of the German population.

The German Empire disintegrated and was replaced by the weak and ineffective Weimar Republic, which survived until 1933, when Hitler, newly appointed as Chancellor, suspended democratic procedures and declared himself Führer of the Third German Reich. In Germany, and to a lesser extent in Austria, the period of the Weimar Republic was a time of high unemployment, hyperinflation, and intense and often violent political struggles between socialists, liberal democrats, communists, conservatives, and national socialists. But it was also a time of an unprecedented explosion of creativity in science, architecture, literature, film, music, and the arts. Berlin became a thriving cosmopolitan city, the center of attraction for artistic and scientific elites. In addition to quantum theory, this time produced significant advancements in nuclear physics, radioactivity, and engineering but also gave rise to eugenics and radical racial theories.


Fig. 1.1 From back to front and from left to right: Auguste Piccard, Émile Henriot, Paul Ehrenfest, Édouard Herzen, Théophile de Donder, Erwin Schrödinger, Jules-Émile Verschaffelt, Wolfgang Pauli, Werner Heisenberg, Ralph Howard Fowler, Léon Brillouin, Peter Debye, Martin Knudsen, William Lawrence Bragg, Hendrik Anthony Kramers, Paul Dirac, Arthur Compton, Louis de Broglie, Max Born, Niels Bohr, Irving Langmuir, Max Planck, Marie Skłodowska Curie, Hendrik Lorentz, Albert Einstein, Paul Langevin, Charles-Eugène Guye, Charles Thomson Rees Wilson, Owen Willans Richardson

Quantum theory reached its maturity between 1923 and 1930, thanks to the resumption of international contacts and the free flow of information. A big role in fostering this progress was played by the Solvay conferences, which took place in Brussels and were financed by Belgian industrialist Ernest Solvay. The first two conferences took place in 1911 and 1913, and they resumed in 1921 after an 8-year interruption due to the war. These conferences were attended by all the main players in the physics and chemistry of the time. The photograph above (Fig. 1.1) shows the attendees of the fifth conference, which took place in 1927 and was the culmination of the struggle between Bohr’s and Einstein’s views on the interpretation of quantum mechanics. Bohr won, and from that time on, the Copenhagen interpretation dominated physicists’ thinking about quantum theory.

By the end of the 1920s, the economic situation in Germany had started improving, but the market crash of 1929 and the start of the Great Depression in the USA interrupted the recovery. The economy of the Weimar Republic fell into the abyss, facilitating the rise of the Nazis to power in 1933. The intellectual atmosphere in Germany and Austria had already begun to deteriorate in the beginning of the 1930s with the rise of the movement for Aryan physics spearheaded by such luminaries


as Philipp Lenard and Johannes Stark, but after 1933 the situation for German physicists of Jewish origin became intolerable. Albert Einstein visited the USA in 1933 and never returned; the same year, Max Born was suspended from his position at the University of Göttingen and emigrated to Britain. The Nazis drove Jewish scientists away not only from Germany and Austria but also from all of Central and Eastern Europe. Here is an incomplete list of the refugee physicists: Hans Bethe, John von Neumann, Leo Szilard, James Franck, Edward Teller, Rudolf Peierls, and Klaus Fuchs. Enrico Fermi, whose wife was Jewish, fled fascist Italy. Ironically, while trying to preserve the racial purity of their science, the Nazis destroyed German predominance in physics (as well as in other areas of intellectual and artistic pursuit) and made America into a science powerhouse.

Physicists of Jewish origin were not the only ones to bear the wrath of the Nazis in Germany and Austria. Erwin Schrödinger, who was known to oppose Nazism, was ordered not to leave the country after Hitler declared the Anschluss (union) between Germany and Austria in 1938. Luckily, he managed to escape to Italy, from where he moved to the UK and finally settled in Dublin as the Head of the Institute for Advanced Studies. During this time, he tried to develop a unified field theory, but his most important work of this period was the book What Is Life?, where he introduced the idea that complex molecules can contain genetic information. He returned to Vienna in 1956.

However, not all physicists fled, and some even joined the National Socialist Party. Among the most prominent Nazi physicists was the aforementioned Philipp Lenard, who contributed to the discovery of the photoelectric effect and won the Nobel Prize in 1905. He was an ardent anti-Semite and dismissed Einstein’s works as “Jewish science.” Lenard lived through the war and was removed from his emeritus position at Heidelberg University in 1945 by Allied forces.

Especially sad is the case of Born’s student and collaborator Pascual Jordan. He joined the Nazi party in 1933, even became a member of its SA unit,¹⁰ and enlisted in the Luftwaffe (German Air Force) in 1939. During World War II, he attempted to interest party officials in various weapon schemes but was deemed politically unreliable. Indeed, to his credit, he refused to condemn Einstein and other Jewish physicists. After the war, Wolfgang Pauli interceded on Jordan’s behalf, declaring him rehabilitated. This allowed Jordan to continue his academic career and even secure a tenured position in 1953. Still, flirting with Nazism cost him the Nobel Prize, which he would probably have shared with Born.

¹⁰ Sturmabteilung, or SA, the original paramilitary wing of the Nazi party, played a significant role in Hitler’s rise to power.

I also feel obliged to mention that some German physicists who remained in Germany during the Nazi era, while not actively opposing the regime (which would have been akin to signing their own death sentence), still behaved in a very noble way. Arnold Sommerfeld, who was nominated 84 times for the Nobel Prize but never won, and who was among the early contributors to quantum theory, was an admirer of Einstein and made special efforts to help his Jewish students and assistants, such as Rudolf Peierls and Hans Bethe, find employment outside of Germany. Another patriarch of German science, Otto Hahn, who made significant contributions to the physics and chemistry of radioactivity and won the 1944 Nobel Prize, was an opponent of national socialism. Einstein wrote that Hahn was “one of the very few who stood upright and did the best he could in these years of evil.” For instance, he maintained a long-standing collaboration with an Austrian physicist with Jewish roots, Lise Meitner. In 1938, Hahn helped her escape from Berlin to the Netherlands, giving her his mother’s diamond ring to bribe the frontier guards if needed. After the war, Hahn became the founding president of the Max Planck Society in the new Federal Republic of Germany and one of the most respected citizens of the new country.

Werner Heisenberg, on the other hand, found himself in a very difficult situation. While not a Nazi, he thought of himself as a German patriot and believed that Hitler was a necessary evil to save Germany. So, he stayed put during the Nazi period, justifying it by the desire to preserve German physics. He also agreed to lead the Nazi nuclear program. Obviously, his relationship with former friends became very strained. It is known that he visited Bohr in 1941 while Denmark was under Nazi occupation. The content of the meeting remains a mystery: Bohr refused to make public his account of what transpired, but the meeting did not go too well and Bohr was visibly upset. Bohr did write down his account of the meeting, but it was sealed in his personal papers by the decision of his family. The mystery of this meeting inspired the award-winning play Copenhagen by Michael Frayn and the film of the same name, in which the role of Heisenberg was played by Daniel Craig (the future James Bond of Casino Royale, Quantum of Solace, and Skyfall). I personally found Craig very convincing as Heisenberg, even when he talked about scientific matters. After the war, Heisenberg was cleared of accusations of Nazi collaboration in the course of the denazification process and was allowed to continue his scientific career. However, his behavior during the Nazi time isolated him from other European and American physicists.

Chapter 2 Quantum States

2.1 Classical and Quantum States

The task of any physical theory is to develop the means to predict the results of measurements of a physical quantity at some time in the future, based upon information about the current state of the system under study and knowledge of the interaction of this system with its environment. The term “state” has many different uses in physics. It is used to describe different states of matter (solids, liquids, etc.), and in thermodynamics we derive various equations of state, which are relations between thermodynamic parameters. We can also talk about a particular state of a system, meaning specific values of a set of quantities important for the problem at hand. In this book I will consider, for the most part, systems consisting of a small number of particles, placed in a variety of different environments. The states which I will be dealing with here are “mechanical states,” but the precise meaning of this term depends upon the choice of the conceptual framework, classical or quantum, with which the problem is approached.

In classical physics a mechanical state is most completely described by specifying coordinates and velocities of the particles at any given time. As Laplace said:

Given for one instant an intelligence which could comprehend all the forces by which nature is animated and the respective positions of the beings which compose it, if moreover this intelligence were vast enough to submit these data to analysis, it would embrace in the same formula both the movements of the largest bodies in the universe and those of the lightest atom; to it nothing would be uncertain, and the future as the past would be present to its eyes.

In modern language it means that the acting forces and the initial coordinates and velocities determine the future positions of all bodies in the universe, with an accuracy limited only by the accuracy of the available experimental and computational instruments. Mathematically, the evolution of classical states of a single particle is described by ordinary differential equations of the second order, where given values of coordinates and velocities form a set of initial conditions necessary to find their unique solution. I am ignoring here complications arising in the cases of nonlinear chaotic dynamics, when a uniquely existing solution becomes unstable with respect to small variations in initial conditions, so that practical predictability of the equations of motion is lost. These situations are excluded from consideration in this book.

© Springer International Publishing AG, part of Springer Nature 2018. L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_2

Finding states of a classical particle can be significantly simplified if the system under consideration possesses conserved quantities, called integrals of motion, such as energy or angular momentum. These quantities do not change in the course of the time evolution of the system, and, therefore, their values are determined by the initial conditions. Knowledge of these quantities makes solving the equations of motion easier: for instance, a differential equation of the second order can be reduced to an equation of the first order, or a three-dimensional problem can be converted into its one-dimensional equivalent. At the same time, while simplifying the technical task of solving equations of motion, the existence of integrals of motion does not change the fundamental nature of the classical description of the system.
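To make the reduction of order less abstract, here is a standard textbook illustration (it is not taken from this chapter): a particle of mass $m$ moving along a line in a potential $U(x)$. The symbols $m$, $U$, and $E$ are introduced only for this sketch.

$$\begin{aligned}
m\ddot{x} &= -\frac{dU}{dx} && \text{(second-order equation of motion)} \\
\frac{d}{dt}\left[\frac{m\dot{x}^2}{2} + U(x)\right] &= \dot{x}\left(m\ddot{x} + \frac{dU}{dx}\right) = 0 && \text{(the energy is an integral of motion)} \\
\frac{m\dot{x}^2}{2} + U(x) = E \quad &\Longrightarrow \quad \dot{x} = \pm\sqrt{\frac{2}{m}\left[E - U(x)\right]} && \text{(first-order equation)}
\end{aligned}$$

The value of $E$ is fixed by the initial coordinate and velocity, so nothing beyond the usual initial conditions is needed.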

In a quantum world, we are forced to give up the ability to have complete knowledge of all desirable parameters (coordinates, velocities, energy, etc.) and, therefore, have to redefine the meaning of the term “state.” To develop the concept of a quantum state, I first introduce the idea of an observable, defined as any quantity whose numerical value can be experimentally measured. The list of possible observables is essentially the same as the list of classical parameters: it can include coordinates, momenta, energies, angular momenta, etc. The main difference between the quantum and classical descriptions appears, however, when you recognize that in the quantum world, according to the complementarity principle discussed in the Introduction, not all observables can be measured within the same set of experiments. For instance, reinterpreting Heisenberg’s uncertainty principle, you may say that if a system is in a state with a precisely known coordinate, its momentum cannot be ascribed any definite value, and vice versa: if the system is in a state with a precisely known momentum, its coordinate remains completely undefined. However, you can imagine that there might exist more than one observable whose values can be found with certainty for the same state of the system (an obvious example of such observables is the three components of the position vector or the three components of the momentum vector). Such observables, called mutually consistent, play an important role in quantum theory. The largest set of such observables is called a complete set of mutually consistent observables. Mutually consistent observables often correspond to classical integrals of motion such as energy, angular momentum, etc., which in quantum theory are of much greater importance than in classical mechanics. The values of the complete set of mutually consistent observables are all the information which we can have about a quantum state of a system, and therefore, it seems reasonable to define a quantum state simply as a collection of values of these observables.

To proceed I need to translate these words into some kind of mathematical formulas, or at least something that looks like mathematical formulas. So, brace yourself as I am going to hit you with some heavy formal stuff. Let’s say that the complete set of mutually consistent observables contains $N_{\max}$ elements $\{q^{(i)}\}$, $1 \le i \le N_{\max}$. Let $q^{(i)}_k$ represent the $k$-th value of the $i$-th observable, and let us assume that any given observable $q^{(i)}$ can either change continuously or take a discrete set of values. In the former case, we say that the observable has a continuous spectrum, while in the latter case, it is said to have a discrete spectrum. The British scientist Paul Dirac (I will tell you more about him later in the book), one of the founding fathers of quantum theory, who shared the 1933 Nobel Prize in Physics with Schrödinger, suggested a rather picturesque and intuitively appealing way of representing a quantum state as

$$\left| q^{(1)}_k, q^{(2)}_m, \cdots, q^{(N_{\max})}_p \right\rangle. \tag{2.1}$$

For now this is just a fancy notation, but as I continue to develop the quantum formalism, its utility will become more and more apparent. The information contained in Eq. 2.1 is quite transparent: this expression tells us that if the system is in the state given by Eq. 2.1 and we measure the observable $q^{(i)}$, the result of the measurement will be the value $q^{(i)}_k$. States in which at least one of the observables has different values are mutually exclusive, in the sense that if you repeat the measurement over and over, you will never observe two distinct values of the same observable as long as the measurement is performed on the system in the state described by Eq. 2.1.

The states of the type presented in Eq. 2.1, in which all mutually consistent observables have definite values, are the simplest, but not the only possible states of quantum systems. Actually, as you already know, the whole brouhaha about quantum mechanics and Einstein’s rejection of it arose because observables frequently do not have definite values, so that one cannot predict with certainty the outcome of a measurement. If a measurement is performed on an ensemble of identical systems, all placed in such a state (the same for all systems), different members of the ensemble will generate different outcomes in an unpredictable manner. A similar situation arises in the case of repeated measurements on the same system, provided it is returned to its initial state after each measurement. From a theoretical point of view, states with uncertain outcomes of a measurement can be described as a linear superposition of the simple states, as discussed in the next section.

2.2 Quantum States and Hilbert Vector Space

2.2.1 Superposition Principle

Experimental evidence for the existence of quantum superposition states comes from observations of the interference of electrons or neutrons, described in any textbook dealing with the basic ideas of quantum mechanics. To explain the connection between interference and superposition, we usually draw on an analogy with electromagnetic waves, where interference is a result of the addition of two spatially overlapping coherent waves. Depending upon the phase difference between the waves at any particular point in space, one can observe either bright interference fringes (constructive interference resulting from the phase difference $2\pi n$, where $n$ is an integer) or dark fringes (destructive interference at points where the phase difference is $(2n+1)\pi$). Thus, the argument goes, the interference of quantum particles must also result from the superposition of something having wavelike properties. The particle–wave duality was briefly described in the Introduction, and it is also discussed in lots of other textbooks, including the already quoted The Feynman Lectures on Physics, so I will not dwell on it any longer. I will, however, use the existence of quantum interference to justify introducing superpositions of the states defined in Eq. 2.1. While interference experiments provide convincing but indirect evidence for quantum superposition, superposition states of a few photons and atoms have recently been observed directly.
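The bright and dark fringe conditions quoted above are easy to check numerically. The following is only a minimal sketch, assuming two waves of equal unit amplitude represented by complex numbers; the function name `intensity` and the choice $n = 3$ are mine and purely illustrative.

```python
import cmath
import math


def intensity(a1: complex, a2: complex, phase: float) -> float:
    """Intensity |a1 + a2*exp(i*phase)|**2 of two superposed waves."""
    return abs(a1 + a2 * cmath.exp(1j * phase)) ** 2


n = 3  # any integer would do; this value is an arbitrary choice

# Phase difference 2*pi*n: the two waves add up (bright fringe).
bright = intensity(1.0, 1.0, 2 * math.pi * n)

# Phase difference (2n+1)*pi: the two waves cancel (dark fringe).
dark = intensity(1.0, 1.0, (2 * n + 1) * math.pi)

print(f"bright fringe intensity: {bright:.6f}")
print(f"dark fringe intensity:   {dark:.6f}")
```

With unit amplitudes the bright fringe is four times as intense as a single wave, while the dark fringe vanishes up to floating-point rounding.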

Let me assume that a quantum system can be in one of two states, $|q_1\rangle$ and $|q_2\rangle$, characterized by different values $q_1$ and $q_2$ of some observable $q$ with a discrete spectrum. According to the superposition principle, this system can also be in a state formed by a linear superposition of these two states:

$$|\text{superposition}\rangle = a_1 |q_1\rangle + a_2 |q_2\rangle, \tag{2.2}$$

where $a_1$ and $a_2$ are, in general, complex numbers. Equation 2.2 can be interpreted verbally by saying that in order to form the superposition state $|\text{superposition}\rangle$, one must “multiply” the states $|q_1\rangle$ and $|q_2\rangle$ by the complex numbers $a_1$ and $a_2$, respectively, and then “add” the results. The problem here is, of course, that since I have no idea about the mathematical nature of the object I call a “quantum state” (is it a function, or a number, or some other mathematical object?), I cannot actually tell you how to perform the operations I called multiplication and addition or what they actually mean (this is why I surrounded “add” and “multiply” by quotation marks). Is it possible to make sense of Eq. 2.2 without first assigning some more concrete mathematical meaning to these “quantum states”? It might surprise you, but the answer is yes, it is possible, and mathematicians do this all the time. It is not really necessary to know either the mathematical nature of the state objects or the meaning of the algebraic operations we want to perform with them. All we need is to postulate that these objects and operations exist and possess certain properties. If this sounds too abstract for you, consider this. Modern object-oriented computer languages such as C++ are based exactly on this idea: they introduce an “object,” which can be anything (a number, a matrix, a word, a figure), and define various operations on such objects, such as addition, multiplication, etc. All these operations have well-defined properties, but their concrete meaning depends upon the object to which they are applied. Thus, the symbol “+” between two numbers means one thing, while the same symbol between words or matrices means something completely different, but it still is the same operation because it has the same basic properties.
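Here is a small sketch of that programming analogy, written in Python rather than C++ only for brevity. The class `Pair` is a hypothetical object invented for this illustration; all that matters is that it declares what “+” and multiplication by a number mean for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """A hypothetical two-component object that supports '+' and scaling."""

    x: complex
    y: complex

    def __add__(self, other: "Pair") -> "Pair":
        # "Addition" of two Pair objects: add component by component.
        return Pair(self.x + other.x, self.y + other.y)

    def __rmul__(self, a: complex) -> "Pair":
        # "Multiplication" by a number; Python calls this for `a * pair`.
        return Pair(a * self.x, a * self.y)


# The same symbol "+" acts on numbers, on words, and on Pair objects.
number_sum = 2 + 3
word_sum = "bra" + "ket"
pair_sum = Pair(1, 2j) + Pair(3, 4)

# For Pair objects the operations obey a(p + q) = ap + aq.
a = 2 + 1j
p, q = Pair(1, 2j), Pair(3, 4)
lhs = a * (p + q)
rhs = a * p + a * q

print(number_sum, word_sum, pair_sum, lhs == rhs)
```

Nothing in the last check refers to what a `Pair` “really is”; the check relies only on the declared operations, which is exactly the attitude I am going to take toward quantum states.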

2.2.2 Linear Vector Spaces

I am afraid that now I have to get a bit more abstract with you than you would probably like, but do not get too frustrated about it. After all, you have probably passed a bunch of calculus classes taught by math professors forcing you to learn proofs of all those theorems of existence and uniqueness. The stuff I am feeding you here is just small potatoes compared to that. Anyway, I will begin by postulating that all quantum states can be represented by special objects belonging to a certain class or “space” defined in such a way that if objects given by Eq. 2.1 belong to this “space,” then the objects defined by Eq. 2.2 also belong to it. (Note that the word “space” here has a completely different meaning than in our everyday language or even in the language of the introductory physics courses and replaces such words as class or set of objects with special properties.) I will also assume that there exists a null object $|0\rangle$ such that $|q\rangle + |0\rangle = |q\rangle$ and $0 \cdot |q\rangle = |0\rangle$ (where $0$ in the last expression is just the number zero, unlike $|0\rangle$, which designates the null state object, and the dot $\cdot$ means multiplication by a number). I will also postulate a few distributive and associative properties (ditching the dot $\cdot$ for the sake of compactness and out of habit), such as

$$a \left( |q_1\rangle + |q_2\rangle \right) = a |q_1\rangle + a |q_2\rangle \tag{2.3}$$

$$a_1 |q_i\rangle + a_2 |q_i\rangle = (a_1 + a_2) |q_i\rangle \tag{2.4}$$

$$a_1 \left( a_2 |q_i\rangle \right) = (a_1 a_2) |q_i\rangle \tag{2.5}$$

which allow carrying out standard algebraic operations with the quantum state objects.

A familiar example of objects which possess all the properties specified in Eqs. 2.2–2.5, provided that the coefficients $a_i$ are confined to the set of real numbers, is given by the usual three-dimensional vectors, such as the displacement or velocity vectors used in elementary mechanics. All operations used in Eqs. 2.2–2.5 have in this case specific definitions: multiplication by a number is defined as multiplication of a vector’s length by the number while keeping its direction intact for positive coefficients and reversing it for negative coefficients, and the addition of vectors is defined by the triangle or parallelogram rule. It is a matter of simple geometry and algebra to prove the distributive and associative properties of these operations as presented in Eqs. 2.3–2.5.
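If you prefer to see these properties checked on numbers, here is a minimal sketch, assuming that the three-dimensional vectors are stored as tuples of Cartesian components. The helper names `add` and `scale` and all the numerical values are my own illustrative choices.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def add(u: Vec3, v: Vec3) -> Vec3:
    """Vector addition, component by component."""
    return (u[0] + v[0], u[1] + v[1], u[2] + v[2])


def scale(a: float, u: Vec3) -> Vec3:
    """Multiplication of a vector by a real number."""
    return (a * u[0], a * u[1], a * u[2])


q1: Vec3 = (1.0, 2.0, -3.0)
q2: Vec3 = (0.5, -4.0, 2.0)
null: Vec3 = (0.0, 0.0, 0.0)
a1, a2 = 2.0, -3.0

# Properties of the null vector.
null_ok = add(q1, null) == q1 and scale(0.0, q1) == null

# Eq. 2.3: a(q1 + q2) = a q1 + a q2
eq_2_3 = scale(a1, add(q1, q2)) == add(scale(a1, q1), scale(a1, q2))
# Eq. 2.4: a1 q + a2 q = (a1 + a2) q
eq_2_4 = add(scale(a1, q1), scale(a2, q1)) == scale(a1 + a2, q1)
# Eq. 2.5: a1 (a2 q) = (a1 a2) q
eq_2_5 = scale(a1, scale(a2, q1)) == scale(a1 * a2, q1)

print(null_ok, eq_2_3, eq_2_4, eq_2_5)
```

The numbers are chosen so that the floating-point arithmetic is exact; for arbitrary values the comparisons would have to allow for rounding errors.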

Abstract objects satisfying the abovementioned properties are also called vectors and are said to belong to a linear vector space. I will use notation based on Greek letters, such as $|\alpha\rangle$, $|\beta\rangle$, etc., to represent generic elements of the vector space (not necessarily vectors based on values of certain physical observables). Even though these abstract vectors are quite different from what you are used to calling vectors in classical mechanics (e.g., they do not point in any direction in our regular three-dimensional space and can have infinitely many components, represented by complex numbers), I am going to use some of what you know about regular vectors to help you infer additional properties of our abstract vector objects.

For instance, you know that regular three-dimensional vectors can be presented as a linear combination of three mutually perpendicular unit vectors, whose directions are predetermined by an arbitrary choice of three coordinate axes ($X$, $Y$, and $Z$). These unit vectors form what is called a basis in the three-dimensional space: a set of vectors which cannot be expressed as linear combinations of one another but can be used to present any other vector in this space as their linear combination, $\mathbf{A} = A_x \mathbf{e}_x + A_y \mathbf{e}_y + A_z \mathbf{e}_z$, where $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ are the unit vectors in the directions of the $X$, $Y$, and $Z$ axes, respectively. In order to extend the idea of a basis to an arbitrary space of abstract vectors, I first have to acquaint you with the concept of linear independence of vectors. A set of vectors $|q_1\rangle, |q_2\rangle, \cdots, |q_N\rangle$ is called linearly independent if none of the vectors in the set can be presented as a linear combination of the others. A set of linearly independent vectors is complete if adding any other distinct vector to the set makes it linearly dependent. The number of linearly independent vectors in the complete set determines the dimension of the space. This number does not need to be finite: there are spaces with an infinite number of linearly independent vectors. From the definition of the complete set of linearly independent vectors, it follows that any other vector in a given space can be presented as their linear combination:

$$|\alpha\rangle = \sum_i a_i |q_i\rangle, \tag{2.6}$$

where the summation is over all vectors in the complete set. Such a set of vectors is called a basis in a given space. Evidently, the representation of an arbitrary vector in the form of Eq. 2.6 is a formal generalization of the physical superposition principle. I can also identify the set of states characterized by different values of the complete set of mutually consistent observables with a set of linearly independent vectors. Indeed, these states are mutually exclusive and correspond to certain values of the corresponding observables, so that they cannot be presented as a superposition, since a superposition generates states with uncertain values of the corresponding observables. In addition, since I am assuming that I am dealing with a complete set of consistent observables, I cannot add any more states to the set, which means that the set is linearly independent and complete, i.e., can be considered as a basis. Starting with a basis built on the complete set of mutually consistent observables, I can, in principle, form linear combinations which would remain linearly independent among themselves even though they would no longer correspond to definite values of any physical observables. From a purely mathematical standpoint, this new set still forms a basis, so that the choice of the basis is not unique.
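The non-uniqueness of the basis can be illustrated with a toy two-dimensional space. This is only a sketch under my own assumptions: each vector is stored as its pair of coefficients $(a_1, a_2)$ in the basis $|q_1\rangle$, $|q_2\rangle$, and the new basis vectors `plus` and `minus` (the normalized sum and difference of $|q_1\rangle$ and $|q_2\rangle$) are just one convenient choice.

```python
import math


def combine(c1, v1, c2, v2):
    """Return the linear combination c1*v1 + c2*v2 of two coefficient pairs."""
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])


s = 1 / math.sqrt(2)
plus = (s, s)     # (|q1> + |q2>) / sqrt(2)
minus = (s, -s)   # (|q1> - |q2>) / sqrt(2)

# The new vectors are linearly independent: the determinant built from
# their coefficients does not vanish.
det = plus[0] * minus[1] - plus[1] * minus[0]

# An arbitrary vector alpha = a1 |q1> + a2 |q2> (illustrative coefficients).
alpha = (2 + 1j, -3j)

# Its expansion coefficients in the new basis.
b_plus = (alpha[0] + alpha[1]) * s
b_minus = (alpha[0] - alpha[1]) * s

# Rebuilding alpha from the new basis must return the original coefficients.
rebuilt = combine(b_plus, plus, b_minus, minus)
error = max(abs(rebuilt[0] - alpha[0]), abs(rebuilt[1] - alpha[1]))

print(f"determinant: {det:.6f}, reconstruction error: {error:.2e}")
```

The same vector is thus described by two different sets of coefficients, $(a_1, a_2)$ and $(b_+, b_-)$, depending on which basis you choose.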

In addition to generalizing the concept of the basis, I can use the example of three-dimensional geometrical vectors to introduce one more operation involving vectors. You, of course, know that two three-dimensional vectors can be combined to form a “dot” or “scalar” product, which plays an important role in physics. Consider, for instance, the vector of force $\mathbf{F}$ and the position vector $\mathbf{r}$. Assuming that the force is constant, you can define its work, $W$, as the scalar product of the force and the position vector: $W = \mathbf{F} \cdot \mathbf{r}$. You know how to compute this dot product; in fact, you know two equivalent ways of doing so. The dot product can be computed either as

$$W = |\mathbf{F}|\,|\mathbf{r}| \cos\theta, \tag{2.7}$$

where $|\mathbf{F}|$ and $|\mathbf{r}|$ are the magnitudes of the force and of the position vector, while $\theta$ is the angle between these two vectors, or as

$$W = F_x x + F_y y + F_z z, \tag{2.8}$$

where $F_x$, $F_y$, and $F_z$ are the components of the force along the $X$, $Y$, and $Z$ axes of a specified coordinate system, while $x$, $y$, and $z$ are the corresponding components of the position vector (or coordinates, if you wish). The magnitudes of the force and position vectors can be defined via the scalar products of the vectors with themselves, e.g., $|\mathbf{F}| = \sqrt{\mathbf{F} \cdot \mathbf{F}}$. The magnitudes of the vectors and their dot products possess certain important properties expressed by the inequalities listed below:

$$|\mathbf{A}| \ge 0 \tag{2.9}$$

$$|\mathbf{A} \cdot \mathbf{B}| \le |\mathbf{A}|\,|\mathbf{B}| \tag{2.10}$$

$$|\mathbf{A} + \mathbf{B}| \le |\mathbf{A}| + |\mathbf{B}|. \tag{2.11}$$

The first of these inequalities, as well as the statement that the equality in it is only reached for a null vector, is quite trivial for regular vectors. Equation 2.10 follows from Eq. 2.7 and the limitation on the values of the cosine function, $|\cos\theta| \le 1$, and the last of these inequalities expresses the well-known geometrical fact that the length of any side of a triangle cannot exceed the sum of the lengths of the other two sides. The equality in this case can only be reached for degenerate triangles, in which all sides are aligned along a single line.
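Both ways of computing the dot product, as well as inequalities 2.10 and 2.11, can be tried out on a concrete pair of vectors. In the sketch below the force and position vectors are hypothetical: the force points along the $X$ axis and the position vector lies in the $XY$ plane at 45° to it, so that the angle $\theta$ is known from geometry.

```python
import math


def dot(a, b):
    """Dot product in component form, as in Eq. 2.8."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(a):
    """Magnitude of a vector, defined as sqrt(a . a)."""
    return math.sqrt(dot(a, a))


F = (3.0, 0.0, 0.0)   # hypothetical constant force along X
r = (2.0, 2.0, 0.0)   # hypothetical position vector at 45 degrees to X
theta = math.pi / 4   # angle between F and r, known from the geometry

w_components = dot(F, r)                                      # Eq. 2.8
w_geometric = magnitude(F) * magnitude(r) * math.cos(theta)   # Eq. 2.7

total = tuple(x + y for x, y in zip(F, r))
eq_2_10 = abs(dot(F, r)) <= magnitude(F) * magnitude(r)
eq_2_11 = magnitude(total) <= magnitude(F) + magnitude(r)

print(w_components, w_geometric, eq_2_10, eq_2_11)
```

The two values of the work agree up to rounding errors, and both inequalities hold, as they must for any pair of regular vectors.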

Since the magnitude and the dot product play such an important role in applications of regular vectors, you would be correct to think that it is a clever idea to introduce similar operations for the abstract vectors as well. The problem is that since I do not know what my abstract vectors actually are, I cannot give the magnitude and the dot product an operational definition (meaning a prescription for how to compute them, similar to Eq. 2.7 or 2.8). What I can and will do is to postulate that these operations exist and define them by requiring that whatever they are operationally, they must have the same properties as those given in Eqs. 2.9–2.11. When talking about abstract vectors, however, it is customary to use the term “norm” instead of the magnitude and “inner product” instead of the dot product. The notation usually used for the norm of an abstract vector $|\alpha\rangle$ is $\|\alpha\|$, so that Eq. 2.9 takes the form

$$\|\alpha\| \ge 0, \tag{2.12}$$

which obviously implies that the norm is necessarily real-valued (as opposed to complex-valued). However, an attempt at defining the norm as an inner product of the vector with itself, as well as defining the inner product by simply extending Eq. 2.8 to an arbitrary number of components, results in a problem.

It turns out that unlike the case of regular three-dimensional vectors, in general it is not possible to define the inner product using only vectors belonging to the same vector space. To see why this is so, consider an example of single-column matrices $k$ with $N$ rows ($N \times 1$ matrices, or column vectors) with complex-valued elements. Multiplication by a complex number, $a \cdot k$, is in this case defined in the obvious way as

$$a \cdot k \equiv a \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = \begin{bmatrix} a k_1 \\ a k_2 \\ \vdots \\ a k_{N-1} \\ a k_N \end{bmatrix} \tag{2.13}$$

and produces another column vector. The addition of two column vectors, $k + p$, is defined as

$$k + p \equiv \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} + \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix} = \begin{bmatrix} k_1 + p_1 \\ k_2 + p_2 \\ \vdots \\ k_{N-1} + p_{N-1} \\ k_N + p_N \end{bmatrix} \tag{2.14}$$

and also produces a column vector. Obviously, column vectors form a linear space. Now, in regular matrix algebra, you can define two types of products, the inner product and the outer product, but neither of them can be introduced using only column vectors. To introduce either of these two operations, you need to combine a column vector with a row vector (a $1 \times N$ matrix). Now, if you place the row vector to the left of the column vector and use the regular matrix multiplication rules (row by column), you can convert the two vector objects into a number:

$$\begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix} = \sum_{i=1}^{N} k_i p_i. \tag{2.15}$$

This is the inner product of the row and the column vectors. If you swap the two objects, placing the column to the left of the row, you can generate an $N \times N$ matrix using the following rule:

$$\begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_{N-1} \\ p_N \end{bmatrix} \begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix} = \begin{bmatrix} p_1 k_1 & p_1 k_2 & \cdots & p_1 k_N \\ p_2 k_1 & p_2 k_2 & \cdots & p_2 k_N \\ \vdots & \vdots & \ddots & \vdots \\ p_N k_1 & p_N k_2 & \cdots & p_N k_N \end{bmatrix}. \tag{2.16}$$

What you get here is the so-called outer or tensor product of the row and the column vectors.
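The two multiplication rules of Eqs. 2.15 and 2.16 can be sketched in a few lines of code. I am assuming here that a row and a column are both stored as plain Python lists, so that the distinction between them exists only in how the functions `inner` and `outer` (names of my own choosing) use them; the entries are arbitrary illustrative numbers.

```python
def inner(row, column):
    """Row times column (Eq. 2.15): the result is a single number."""
    return sum(r * c for r, c in zip(row, column))


def outer(column, row):
    """Column times row (Eq. 2.16): an N x N matrix stored as a list of rows."""
    return [[c * r for r in row] for c in column]


k_row = [1, 2j, 3]        # illustrative row vector
p_col = [4, 5, 1 - 1j]    # illustrative column vector

number = inner(k_row, p_col)   # k1*p1 + k2*p2 + k3*p3
matrix = outer(p_col, k_row)   # matrix[i][j] = p_i * k_j

print(number)
for matrix_row in matrix:
    print(matrix_row)
```

The same pair of objects produces a number in one order and a $3 \times 3$ matrix in the other, which is why the order of the row and the column matters.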

The row vectors do not belong to the same vector space as column vectors (you cannot add a column and a row): they form their own space, called the adjoint space. In order to establish a proper relationship between the row space and the column space, we need to determine how to convert a column vector into a row vector. It appears that the answer is almost trivial: one can do it using the matrix operation known as transposition, which transforms a column vector $k$ into the respective row vector $k^T$. However, it is easy to see that by using simple transposition, you will not be able to generate an inner product and a norm satisfying the condition presented in Eq. 2.12. Indeed, generalizing Eq. 2.9, you can try introducing the norm using an inner product of a row obtained by transposition of the initial column and the column itself. This procedure will yield

$$k^T \cdot k \equiv \begin{bmatrix} k_1 & k_2 & \cdots & k_{N-1} & k_N \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = \sum_{i=1}^{N} (k_i)^2. \tag{2.17}$$

If the $k_i$ are complex-valued quantities, the result of this multiplication is not necessarily real-valued, in clear contradiction with the required property of the norm. The problem, however, can be fixed if, in addition to transposing a column vector, you also complex conjugate its elements. The resulting operation is called Hermitian transposition or Hermitian conjugation, and it turns a column vector $k$ into its adjoint, or Hermitian conjugate, row vector $k^\dagger$. Using Hermitian conjugation rather than simple transposition turns Eq. 2.17 into

$$k^\dagger \cdot k \equiv \begin{bmatrix} k_1^* & k_2^* & \cdots & k_{N-1}^* & k_N^* \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_{N-1} \\ k_N \end{bmatrix} = \sum_{i=1}^{N} |k_i|^2, \tag{2.18}$$

where $|k_i|$ means the absolute value of the respective complex number. Now you can define the inner product of two column vectors $k$ and $p$, denoted $(k, p)$, as an operation which involves Hermitian conjugation of the vector on the left and forming an inner product of the resulting row with the column on the right: $(k, p) = k^\dagger \cdot p$. It is obvious that the inner product defined this way is not commutative, meaning that, in general, $(k, p) \ne (p, k)$. Actually, it is easy to see that $(k, p) = (p, k)^*$. Using Hermitian conjugation, one can also define a new kind of outer product, but I will leave the discussion of the latter till later.
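A short numerical experiment shows the difference between simple transposition and Hermitian conjugation. This is a sketch only: column vectors are stored as Python lists of complex numbers, the function names are mine, and the entries of `k` and `p` are arbitrary.

```python
def transpose_product(k):
    """k^T . k as in Eq. 2.17: a sum of squares, complex in general."""
    return sum(x * x for x in k)


def inner(k, p):
    """(k, p) = k^dagger . p: conjugate the left vector, then row by column."""
    return sum(x.conjugate() * y for x, y in zip(k, p))


k = [1 + 2j, 3j, 2 + 0j]    # illustrative column vectors
p = [2 - 1j, 1 + 1j, 1j]

plain = transpose_product(k)   # not real, so useless as a squared norm
norm_squared = inner(k, k)     # sum of |k_i|^2, real and nonnegative

kp = inner(k, p)
pk = inner(p, k)

print("k^T k  =", plain)
print("(k, k) =", norm_squared)
print("(k, p) =", kp, " (p, k) =", pk)
```

For these particular vectors, $(k, p)$ and $(p, k)$ come out as complex conjugates of each other, in agreement with $(k, p) = (p, k)^*$.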

A linear vector space with a defined inner product and a norm becomes something which mathematicians call a Hilbert space. (There are some mathematical niceties and details concerning the exact definition of the Hilbert space, but they are of no concern to us here.) I hope that the example with column and row vectors helps you to realize that in order to define the norm and the inner product for our abstract vectors $|\alpha\rangle$, you need complementary vectors inhabiting an adjoint space. Again following Dirac, I will designate the vector adjoint to $|\beta\rangle$ as $\langle\beta|$ and use the notation $\langle\beta|\alpha\rangle$ for the inner product. In the case of abstract vectors, we do not really know how to actually compute the inner product, but whatever its operational definition might be, we require that it obey the following condition:

$$\langle\beta|\alpha\rangle = \left( \langle\alpha|\beta\rangle \right)^*. \tag{2.19}$$

This property is what allows the norm defined as

$$\|\alpha\| \equiv \sqrt{\langle\alpha|\alpha\rangle} \tag{2.20}$$

to be real and nonnegative. Indeed, applying Eq. 2.19 to the case when $|\beta\rangle = |\alpha\rangle$, you have $\langle\alpha|\alpha\rangle = \left( \langle\alpha|\alpha\rangle \right)^*$. A number that is equal to its own complex conjugate has no imaginary part, which proves that $\langle\alpha|\alpha\rangle$ is real-valued; the requirement of Eq. 2.12 then means that it cannot be negative either.

To distinguish between adjoint vectors in speech, we call $|\alpha\rangle$ a ket vector and $\langle\alpha|$ a bra vector. These terms were introduced by Paul Dirac together with the respective notation. Their origin can be traced to the word “bracket” split into two halves, “bra” and “ket,” just like the angular brackets in $\langle\alpha|\alpha\rangle$ are split between two vectors (what happened to the letter “c” in the process is anybody’s guess).

Converting a single generic ket into the respective bra requires no operational prescription so far: all you need to do is to change the orientation and position of the angular bracket from $|\;\rangle$ to $\langle\;|$. However, an important question which we need to figure out now is how to do this conversion in the case of such expressions as $a|\alpha\rangle$ or even more complex expressions of the kind of Eq. 2.2. What is clear is that the adjoint of this expression must look like

$$\left( a|\alpha\rangle \right)^\dagger = \tilde{a} \langle\alpha|,$$

where I used the symbol $\dagger$ to designate conversion to the adjoint space (or performing Hermitian conjugation, just like in Eq. 2.18). Now I need to find how the coefficient $\tilde{a}$ is related to $a$. To this end let me compute the norm $\|a\alpha\|$:

$$\|a\alpha\|^2 = \tilde{a} a \langle\alpha|\alpha\rangle = \tilde{a} a \|\alpha\|^2.$$


For this expression to be real-valued for all possible coefficients $a$, I have no choice but to require that $\tilde{a} = a^{*}$. This yields a simple rule for Hermitian conjugation of complex expressions involving ket vectors: convert all kets into bras by simply replacing one angular bracket with the other, and complex conjugate all numerical coefficients appearing in the original expression. Here are a few examples of the application of this rule.

Example 1 (Hermitian Conjugation) Perform Hermitian conjugation of the following expression:

$$|\alpha\rangle = (2i + 3)\,|q_1\rangle + 5i\,|q_2\rangle.$$

Solution

$$\langle\alpha| = (-2i + 3)\,\langle q_1| - 5i\,\langle q_2|.$$

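Example 2 (Norm Calculation) If $\|q_1\| = 2$, $\|q_2\| = 4$, and $\langle q_1|q_2\rangle = i - 1$, find the norm of $|\alpha\rangle$ defined in the previous example.

Solution

$$\begin{aligned}
\|\alpha\|^{2} &= \left((-2i + 3)\langle q_1| - 5i\langle q_2|\right)\left((2i + 3)|q_1\rangle + 5i|q_2\rangle\right) \\
&= (-2i + 3)(2i + 3)\|q_1\|^{2} + 5i(-5i)\|q_2\|^{2} + 5i(-2i + 3)\langle q_1|q_2\rangle - 5i(2i + 3)\langle q_2|q_1\rangle \\
&= 13 \cdot 4 + 25 \cdot 16 + (10 + 15i)(i - 1) + (10 - 15i)(-i - 1) = 402, \\
\|\alpha\| &= \sqrt{402}.
\end{aligned}$$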
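The arithmetic of Example 2 can be cross-checked with a few lines of plain Python. This is only a sketch: the names `coeffs` and `gram` are introduced here for illustration, with `gram[i][j]` standing for the inner product $\langle q_i|q_j\rangle$ assembled from the data given in the example.

```python
import math

# Coefficients of |alpha> = (2i + 3)|q1> + 5i|q2> from Example 1.
coeffs = [3 + 2j, 5j]

# Gram matrix gram[i][j] = <q_i|q_j>, built from ||q1|| = 2, ||q2|| = 4,
# <q1|q2> = i - 1 and <q2|q1> = (<q1|q2>)^*.
q1q2 = -1 + 1j
gram = [[4, q1q2],
        [q1q2.conjugate(), 16]]

# ||alpha||^2 = sum over i, j of conj(a_i) <q_i|q_j> a_j
norm_sq = sum(coeffs[i].conjugate() * gram[i][j] * coeffs[j]
              for i in range(2) for j in range(2))

print(norm_sq)                  # the imaginary part vanishes
print(math.sqrt(norm_sq.real))  # the norm itself
```

The two cross terms are complex conjugates of each other, so their sum is real, as Eq. 2.19 requires.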

It will be shown in the future development of the formalism that vectors $|\alpha\rangle$ and $a|\alpha\rangle$, where $a$ is an arbitrary complex coefficient, describe the same quantum state, i.e., contain the same information about the system. Therefore, it is often convenient to deal only with states whose norm is equal to unity (in a way, such states are analogous to unit vectors in regular three-dimensional space). Such vectors can be produced via the normalization procedure, which consists in replacing the original vector $|\alpha\rangle$ with the vector $|\alpha\rangle / \|\alpha\|$. In the future, we shall assume that all abstract vectors used in calculations are normalized, and those that are not will have to be normalized before any calculations with them are performed.

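Example 3 (Normalization) Normalize the following state:

$$|\alpha\rangle = 2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle,$$

assuming that the norm of all vectors $|q_i\rangle$ is unity and that inner products involving pairs of different vectors are all equal to zero.

Solution

1st step: Hermitian conjugation

$$\langle\alpha| = 2\langle q_1| - 3i\langle q_2| - \frac{2}{3}\langle q_3|.$$

2nd step: Forming the inner product

$$\|\alpha\|^{2} = \langle\alpha|\alpha\rangle = \left(2\langle q_1| - 3i\langle q_2| - \frac{2}{3}\langle q_3|\right)\left(2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle\right) = 4 + 9 + \frac{4}{9} = \frac{121}{9}.$$

3rd step: The normalization

$$\|\alpha\| = \sqrt{\frac{121}{9}} = \frac{11}{3}, \qquad |\alpha\rangle_{N} = \frac{3}{11}\left(2|q_1\rangle + 3i|q_2\rangle - \frac{2}{3}|q_3\rangle\right).$$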
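The three steps of Example 3 reduce to a short computation when the basis is orthonormal. Here is a minimal sketch in plain Python; the function name `normalize` is hypothetical, and the state is represented simply by the list of its expansion coefficients.

```python
import math

def normalize(coeffs):
    """Return (norm, normalized coefficients) for a state expanded in an orthonormal basis."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return norm, [c / norm for c in coeffs]

# |alpha> = 2|q1> + 3i|q2> - (2/3)|q3>
norm, unit = normalize([2, 3j, -2 / 3])
print(norm)                              # compare with 11/3
print(sum(abs(c) ** 2 for c in unit))    # the normalized state has unit norm
```

For a basis that is not orthonormal, the cross terms would have to be included as in Example 2.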

I will complete this discussion of vector states in their ket and bra reincarnations by considering another important example of vector spaces and the respective inner products and norms. Consider a class of complex functions $\psi(x)$ of a single variable $x$ defined over the domain $x \in [-\infty, \infty]$, which also satisfy the condition

$$\int_{-\infty}^{\infty} |\psi(x)|^{2}\,dx < \infty.$$

It is very easy to show that linear combinations of such functions also belong to the same class, so they do form a linear vector space. The inner product of two functions $\psi(x)$ and $\varphi(x)$, defined as

$$(\psi, \varphi) = \int_{-\infty}^{\infty} \psi^{*}(x)\varphi(x)\,dx, \tag{2.21}$$

satisfies the condition presented by Eq. 2.19, and the respective norm

$$\|\psi\| = \sqrt{\int_{-\infty}^{\infty} |\psi(x)|^{2}\,dx}$$

is obviously real-valued and nonnegative. Thus, these functions, called square-integrable, do form a Hilbert vector space, in which for each ket vector $|\alpha\rangle \equiv \psi(x)$ there is an adjoint bra vector $\langle\alpha| \equiv \psi^{*}(x)$, with the inner product and the norm defined as the respective integrals.


So, I gave you two very different concrete realizations of an abstract Hilbert space: column vectors and square-integrable functions, with two different operational definitions of the inner product. Despite the difference in the operational meaning of the inner product in these two cases, both have the same defining properties.

2.2.3 Superposition Principle and Probabilities

States characterized by definite values of the complete set of mutually consistent observables have an important mathematical property, which I cannot yet prove (it will be done later), but it is important for the argument I am trying to present, so you will have to trust me on this for now. So, here is the property:

$$\left\langle q^{(1)}_{l}, q^{(2)}_{n}, \cdots q^{(N_{max})}_{r} \middle| q^{(1)}_{k}, q^{(2)}_{m}, \cdots q^{(N_{max})}_{p} \right\rangle = \delta_{l,k}\delta_{n,m}\cdots\delta_{r,p}, \tag{2.22}$$

where I assumed that the state vectors are normalized ($\delta_{l,k}$ is the Kronecker delta, which is equal to unity for coinciding indices and zero otherwise). This equation surely looks rather mysterious, but its actual content is rather simple. To see this, imagine that you are dealing with a system described by a single observable, say, energy. To make the matter even simpler, assume also that the energy can only take two values, 0 and 1. Then you are dealing with only two states with definite values of energy, $|1\rangle$ and $|0\rangle$. Equation 2.22 in this case simply states that

$$\langle 0|1\rangle = 0, \qquad \langle 0|0\rangle = \langle 1|1\rangle = 1.$$

Abstract vectors, whose inner product is equal to zero, are called orthogonal in the obvious generalization of the concept of orthogonality for regular perpendicular three-dimensional vectors, whose dot product is equal to zero. What I am driving at here is the connection between the mathematical concept of orthogonality and the physical notion of mutual exclusivity discussed in Sect. 2.1: Eq. 2.22 does say that mutually exclusive states are also orthogonal. If the basis is constructed of mutually orthogonal and normalized vectors, it is called orthonormalized. Such bases are particularly convenient to work with, and, therefore, they are used almost exclusively in practical calculations. Here is an example of a system of vectors forming an orthonormal basis, which you are quite familiar with.

Example 4 (Fourier Series: Examples of a Discrete Orthonormal Basis) Consider a set of functions of a single variable $x$ defined on a finite interval $[0, L]$. These functions obviously form a linear space with an inner product defined as

$$\langle f|g\rangle = \int_{0}^{L} f^{*}(x)g(x)\,dx.$$


It is well known that these functions can be expanded into a Fourier series¹

$$f(x) = \sqrt{\frac{1}{L}}\sum_{n=-\infty}^{\infty} a_n e^{i2\pi nx/L}$$

with expansion coefficients given by

$$a_n = \sqrt{\frac{1}{L}}\int_{0}^{L} f(x)e^{-2i\pi nx/L}\,dx.$$

One can identify $|\nu_n\rangle \equiv \sqrt{\frac{1}{L}}\,e^{i2\pi nx/L}$ with vectors of an orthonormal basis, so that the Fourier series expansion of the function can be presented in the form of Eq. 2.6. Indeed, it is easy to see that

$$\langle\nu_m|\nu_n\rangle = \frac{1}{L}\int_{0}^{L} dx\,e^{-2i\pi mx/L}e^{2i\pi nx/L} = \begin{cases} 1, & m = n \\ 0, & m \neq n. \end{cases}$$
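The orthonormality of the Fourier basis functions can also be verified numerically. The sketch below uses only the Python standard library and approximates the integral over $[0, L]$ by a Riemann sum; the function names `basis` and `inner`, the value `L = 2.0`, and the number of steps are all assumptions made for illustration.

```python
import cmath
import math

def basis(n, x, L):
    """Basis function sqrt(1/L) * exp(i 2 pi n x / L)."""
    return math.sqrt(1 / L) * cmath.exp(2j * math.pi * n * x / L)

def inner(m, n, L=2.0, steps=2000):
    """Riemann-sum approximation of the integral of conj(basis_m) * basis_n over [0, L]."""
    dx = L / steps
    return sum(basis(m, k * dx, L).conjugate() * basis(n, k * dx, L)
               for k in range(steps)) * dx

print(abs(inner(3, 3)))   # expected to be close to 1
print(abs(inner(3, 5)))   # expected to be close to 0
```

Because the integrand is periodic on $[0, L]$, a uniform Riemann sum is already very accurate here; for non-periodic integrands a finer rule would be needed.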

Mutual exclusivity can be more formally expressed in terms of probabilities: if a quantum system is in a state with prescribed values $q^{(1)}_{l}, q^{(2)}_{n}, \cdots q^{(N_{max})}_{r}$ of a complete set of mutually consistent observables, then the measurement of the observable $q^{(s)}$ will produce the value $q^{(s)}_{p}$ appearing in the definition of the state with a probability equal to 1, while the probability of any other value is equal to zero. Now, let’s turn our attention to the superposition state of the type presented in Eq. 2.2 and ask a question: what should we expect if we measure observable $q$ while the system is in the state given by this equation? You already know that the exact result of the measurement cannot be predicted, but it is intuitively clear that it must be either $q_1$ or $q_2$. The only question you need to ponder now is this: if the measurement is repeated multiple times with the system always brought back to the same initial state (or if, instead of one system, you got your hands on an ensemble of identical systems all in the same state), what are the fractions of measurement outcomes yielding $q_1$ and $q_2$? It seems reasonable to assume that it is the coefficients $a_{1,2}$ in the superposition that determine the answer to this question. Indeed, if I set either $a_1$ or $a_2$ to zero, I will take you back to the state in which all measurements yield either $q_2$ or $q_1$, respectively, while its counterpart is never observed. It is natural to describe this situation in terms of probabilities and assume that these probabilities are determined

¹ There are some conditions on the required smoothness of the functions for their Fourier series to actually represent them accurately, but I leave it to mathematicians to worry about these details. Here I just want to mention that the representation as a Fourier series works even for functions that are not necessarily continuous.


by the coefficients $a_i$. However, it is clear that I cannot identify the coefficients themselves with the probabilities because the $a_i$ are not necessarily positive or even real (if this were not the case, we could not describe both constructive interference and destructive interference just as in the case of electromagnetic waves). At the same time, you can recall that in the case of wave interference, it is not the amplitudes of the waves but their intensities, proportional to the squared absolute values of the amplitudes, that determine the brightness of the interference fringes. Thus, we can surmise that in the case of quantum superposition, the respective probabilities are given by $|a_i|^{2}$:

$$p(q_i) = |a_i|^{2}. \tag{2.23}$$

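Multiplying Eq. 2.2 by $\langle q_1|$ or $\langle q_2|$ (the bra counterparts of the respective vectors $|q_i\rangle$) from the left and using the orthogonality condition, Eq. 2.22, I can derive for the coefficients $a_i$

$$\begin{aligned}
\langle q_1|\alpha\rangle &= \langle q_1|\left(a_1|q_1\rangle + a_2|q_2\rangle\right) = a_1\|q_1\|^{2} \;\Rightarrow\; a_1 = \langle q_1|\alpha\rangle \\
\langle q_2|\alpha\rangle &= \langle q_2|\left(a_1|q_1\rangle + a_2|q_2\rangle\right) = a_2\|q_2\|^{2} \;\Rightarrow\; a_2 = \langle q_2|\alpha\rangle,
\end{aligned} \tag{2.24}$$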

where I took into account the convention that all vectors describing quantum states are presumed to be normalized. Expressions derived in Eq. 2.24 allow presenting Eq. 2.23 for probability in a more generic form:

$$p(q_i) = \left|\langle q_i|\alpha\rangle\right|^{2}. \tag{2.25}$$

Applying Eq. 2.25 to the case of $|\alpha\rangle = |q_j\rangle$, you find $p(q_i) = \delta_{i,j}$, establishing a formal correspondence between the notions of mutual exclusivity and orthogonality. Computation of the norm of the state $|\alpha\rangle$ yields

$$\|\alpha\|^{2} = |a_1|^{2} + |a_2|^{2} \equiv p_1 + p_2.$$

If $\|\alpha\| = 1$, i.e., the state $|\alpha\rangle$ is normalized as presumed, then you obtain the relation $p_1 + p_2 = 1$, in complete agreement with what is expected of the probabilities. This result reinforces my (well, actually Max Born’s) suggestion to interpret $\left|\langle q_i|\alpha\rangle\right|^{2}$ as a probability that the measurement of observable $q$ on a system in state $|\alpha\rangle$ will produce $q_i$.
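Here is a minimal sketch of how Eq. 2.23 turns expansion coefficients into probabilities, written in plain Python. The function name `probabilities` and the sample coefficients are illustrative assumptions; the basis states are taken to be orthonormal, and the state is normalized before the probabilities are computed.

```python
import math

def probabilities(coeffs):
    """Normalize the expansion coefficients and return p_i = |a_i|^2."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in coeffs))
    return [abs(a / norm) ** 2 for a in coeffs]

# Unnormalized superposition a1|q1> + a2|q2> with illustrative coefficients.
p1, p2 = probabilities([1 + 1j, 2j])
print(p1, p2, p1 + p2)
```

The phases of the coefficients drop out of the individual probabilities, and the probabilities add up to one once the state is normalized.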

It is important to emphasize that any uncertainty in the result of the measurement of the observable q exists only before the measurement has taken place and that the probability referred to in this discussion describes the number of given outcomes in the series of such measurements. After the measurement is carried out, and one of the values of the observable is actually observed, all uncertainty has disappeared. We now know that the measurement yielded a particular value qi, which, according to our earlier proposition, is only possible if the system is in the respective state


$|q_i\rangle$. Thus we have to conclude that the act of measurement has destroyed the initial state $|\alpha\rangle$ and “collapsed” it into state $|q_i\rangle$. The most intriguing question, of course, is what determines the state into which the system collapses. This question has been debated during the entire 100+ year-long history of quantum mechanics and is being debated still. The orthodox Copenhagen interpretation of quantum mechanics essentially leaves this question without an answer, claiming that the choice of the final (after measurement) state is completely random.² I propose that you accept this interpretation as quite sufficient for most practical purposes, even though it does leave people with a philosophical turn of mind somewhat unsatisfied.

Equation 2.2 describes a superposition of only two states. It is not too difficult to imagine that it can be extended to the case of an arbitrary number of states $\left|q^{(1)}_{k}, q^{(2)}_{m}, \cdots q^{(N_{max})}_{p}\right\rangle$ generated by mutually consistent observables with discrete spectrum:

$$|\alpha\rangle = \sum_{k,m,\cdots,p} a_{k,m,\cdots,p}\left|q^{(1)}_{k}, q^{(2)}_{m}, \cdots q^{(N_{max})}_{p}\right\rangle. \tag{2.26}$$

This sum, in principle, can contain any number of terms, including infinitely many. In the latter case, of course, one has to start worrying about its convergence, but I will leave these worries to mathematicians. The coefficients $a_{k,m,\cdots,p}$ appearing in this equation have the same meaning as the coefficients in the two-state superposition and are expressed by Eq. 2.25, in which $|q_i\rangle$ is replaced with the more general state appearing in Eq. 2.26.

² At a talk given at the physics department of Queens College in New York in 2014, British mathematician J.H. Conway (currently Professor Emeritus of Mathematics at Princeton University) dismissed the randomness postulate of the Copenhagen interpretation as a “cop-out,” arguing also that the use of probabilities only makes sense when one deals with a well-defined ensemble of events or particles, which is not true in the case of a single electron or photon. At the same time, he and S. Kochen (Canada) proved a mathematical theorem asserting that the entire structure of quantum mechanics is inconsistent with the idea of the existence of some unknown characteristics of quantum systems, which would, should we find them, provide a deterministic description of the system. In this sense, they proved completeness of the existing structure of quantum theory and buried the idea of “hidden variables” (unknown elements of reality that could restore determinism to the quantum world), provided that we are unwilling to throw away the entire conceptual structure of quantum mechanics, which, so far, has given an excellent quantitative explanation of a vast amount of experimental data. The Conway and Kochen theorem is called the “free will theorem” because it can be interpreted as an assertion that electrons, just like humans, have “free will,” which in a strict mathematical sense means that an electron’s future behavior might not be a deterministic function of its past. The description of the theorem can be found here: https://en.wikipedia.org/wiki/Free_will_theorem.


2.3 States Characterized by Observables with Continuous Spectrum

In the previous section, I considered only states generated by observables with discrete spectrum. As a result, even though the number of states in Eq. 2.26 can be infinite, they are still countable (one can enumerate them using the natural numbers $1, 2, 3, \cdots$). Some observables, however, have a continuous spectrum, meaning that they can take values from a continuous (finite or infinite) interval of values. One such important observable is a particle’s position, measured by its position vector $\mathbf{r}$ or a set of Cartesian coordinates $(x, y, z)$ defined in a particular coordinate system. It is interesting to note in this regard that while in classical mechanics descriptions using Cartesian coordinates are largely equivalent to those relying on spherical or polar coordinates, it is not so in the quantum description, where angular coordinates in spherical or cylindrical systems do not easily submit to quantum treatment. This comment obviously appears somewhat cryptic here, but its meaning will be clarified in the subsequent chapters. Another peculiarity of the position observable is the need to carefully distinguish between coordinates as characteristics of a particle’s position and coordinates as markers of various points in space, needed to describe the position dependence of various mathematical and physical quantities.

Other observables, such as energy or momentum, might have either continuous or discrete spectrum depending upon the environment, in which a particle finds itself, or might have mixed spectrum, where an interval of discretely defined values crosses over into an interval of continuously distributed values.

Two main peculiarities of states characterized by observables with continuous spectrum are that (1) they cannot be normalized in a regular sense of the word and (2) the concept of probability as defined by Eq. 2.25 loses its meaning because in the case of continuous random variables, the probability can only be defined for an interval (which might be infinitesimally small) of values, but not for any particular value of the variable. These two features are not independent and are related to each other as it will be seen from the future analysis.

I will illustrate the properties of states corresponding to observables with continuous spectrum using the position of a particle as an example. Assuming that there are no other observables mutually consistent with the position, I will present a state in which this observable has a definite value $\mathbf{r}$ as $|\mathbf{r}\rangle$. In order to construct a superposition state using these vectors, I have to replace the sum in Eq. 2.26 with an integral over all possible values of the position vector $\mathbf{r}$, introducing, instead of coefficients $a_k$ with discrete indices, a function $\psi(\mathbf{r})$ of a continuous variable:

$$|\alpha\rangle = \int d^{3}r\,\psi(\mathbf{r})\,|\mathbf{r}\rangle. \tag{2.27}$$

Now I want to compute the norm $\|\alpha\|$ of the superposition state given by Eq. 2.27. The Hermitian conjugation of Eq. 2.27 produces the respective bra vector

$$\langle\alpha| = \int d^{3}r\,\psi^{*}(\mathbf{r})\,\langle\mathbf{r}| \tag{2.28}$$


so that the norm becomes

$$\|\alpha\|^{2} \equiv \langle\alpha|\alpha\rangle = \iint d^{3}r_1\,d^{3}r_2\,\psi^{*}(\mathbf{r}_1)\psi(\mathbf{r}_2)\,\langle\mathbf{r}_1|\mathbf{r}_2\rangle, \tag{2.29}$$

where I had to rename the integration variables in order to be able to replace the product of integrals with a double integral (note that $\mathbf{r}_1$ appears in those parts of Eq. 2.29 which originate from the bra vector of Eq. 2.28, and $\mathbf{r}_2$ appears in the ket-related parts of the integral). States $|\mathbf{r}_1\rangle$ and $|\mathbf{r}_2\rangle$ remain mutually exclusive even in the case of continuous spectrum as long as $\mathbf{r}_1 \neq \mathbf{r}_2$. Thus I can write, based on the discussion in the previous sections, that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \neq \mathbf{r}_2$. If I now require that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 1$ for $\mathbf{r}_1 = \mathbf{r}_2$, which would correspond to the “regular” normalization condition, I will end up with an integral in which the integrand is zero everywhere with the exception of one point, where it is finite. Clearly such an integral would be zero, in contradiction with the properties of the norm (it can only be zero for the null vector). To save the situation, something has to give, and I have to reject one of the assumptions made when evaluating Eq. 2.29. The mutual exclusivity of $|\mathbf{r}_1\rangle$ and $|\mathbf{r}_2\rangle$ and the related requirement that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \neq \mathbf{r}_2$ are connected with the basic ideas discussed in Sect. 2.2.3 and, therefore, appear untouchable. So, the only choice left to me is to reject the assumption that $\langle\mathbf{r}_1|\mathbf{r}_1\rangle$ is equal to unity or takes any other finite value. As a result, we are left with the following requirements on $\langle\mathbf{r}_1|\mathbf{r}_2\rangle$: this expression must be zero for unequal values of its arguments while producing a non-zero result when integrated with any “normal” function.

These requirements are satisfied by an object called Dirac’s delta-function. Dirac introduced the notation $\delta(x)$ for its simplest single-variable version and presented most of its properties in a form useful for physicists in his influential 1930 book The Principles of Quantum Mechanics, which since then has been reissued many times (it is the same Paul Dirac who introduced the bra-ket notation for quantum states). It seems a bit unfair to name this object after him because it was already known to such mathematicians as Poisson and Fourier in the nineteenth century, but physicists learned about it from Dirac, so we stick to our guns and call it Dirac’s delta-function. The first thing one needs to understand about the delta-function is that it is not a function in any reasonable sense of the word. Therefore, the meaning of such operations as integration or differentiation involving this object cannot be defined following the standard rules of regular calculus. Nevertheless, physicists keep working with this object as though nothing is wrong (giving nightmares to rigor-sensitive mathematicians), with the only requirement being that the results of all performed operations must make sense (that is, from a physicist’s perspective). Mathematicians call such objects “distributions” or treat them as examples of “functionals.” Below I supply you with all the properties of the delta-function you will need to know.

The main defining property of the delta-function of a single variable is

$$\int_{x_1}^{x_2} f(x)\delta(x)\,dx = \begin{cases} f(0), & 0 \in [x_1, x_2] \\ 0, & 0 \notin [x_1, x_2] \end{cases} \tag{2.30}$$

with its immediate generalization to

$$\int_{x_1}^{x_2} f(x)\delta(x - x_0)\,dx = \begin{cases} f(x_0), & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases} \tag{2.31}$$

These equations express the main property of the delta-function: it acts as a selector singling out the value of the function $f(x)$ at $x = x_0$, where the argument of the delta-function vanishes. In the particular case of $f(x) = 1$, Eq. 2.31 yields another important characteristic of the delta-function:

$$\int_{x_1}^{x_2} \delta(x - x_0)\,dx = \begin{cases} 1, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2], \end{cases}$$

which expresses the idea that while the “width” of the delta-function is zero and its “height” is infinite, the area covered by it is equal to unity. An example of an actual limiting procedure producing a delta-function out of a regular function based on this idea can be found in the exercises in this chapter.

One can also define the delta-function of a more complex argument such as $\delta[g(x)]$, where $g(x)$ is an arbitrary function. If $g(x)$ has only one zero, at $x = x_0$, I can define $\delta[g(x)]$ by replacing $g(x)$ with the first term of its Taylor expansion around $x_0$, $g(x) \approx b(x - x_0)$, where $b \equiv (dg/dx)_{x = x_0}$, and making the substitution of variable $\tilde{x} = b(x - x_0)$, which yields

$$\int_{x_1}^{x_2} f(x)\,\delta[g(x)]\,dx = \frac{1}{|b|}\int_{g(x_1)}^{g(x_2)} f\!\left(\frac{\tilde{x}}{b} + x_0\right)\delta(\tilde{x})\,d\tilde{x} = \frac{1}{|b|}f(x_0). \tag{2.32}$$

The expansion of $g(x)$ in the Taylor series is justified here because the value of the integral is determined by the behavior of this function in the immediate vicinity of $x_0$.
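Equation 2.32 can be tested numerically by standing in for the delta-function with a very narrow Gaussian of unit area. The sketch below is written in plain Python under several assumptions made only for illustration: the width `1e-2`, the test function $f(x) = x^3 + 5x$, and the choice $g(x) = x^2 - 1$ on $[0, 2]$, which has the single zero $x_0 = 1$ inside the interval with $b = 2$. The helper names `delta_eps` and `integrate` are hypothetical.

```python
import math

def delta_eps(u, eps):
    """Narrow Gaussian of unit area; it approaches delta(u) as eps -> 0."""
    return math.exp(-u * u / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate(func, a, b, steps):
    """Midpoint-rule integral of func over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h

def f(x):
    return x ** 3 + 5 * x      # test function (illustrative choice)

def g(x):
    return x ** 2 - 1          # single zero x0 = 1 on [0, 2], with b = g'(1) = 2

numeric = integrate(lambda x: f(x) * delta_eps(g(x), 1e-2), 0.0, 2.0, 200_000)
expected = f(1.0) / abs(2.0)   # right-hand side of Eq. 2.32
print(numeric, expected)
```

The integration step must be much smaller than the width of the Gaussian in $x$, which is roughly the chosen width divided by $|b|$; otherwise the narrow peak is missed by the grid.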

If the function $g(x)$ has multiple zeros within the interval of integration, then we must isolate each zero and perform the procedure described above for each of them. The result will look something like this:

$$\int_{x_1}^{x_2} f(x)\,\delta[g(x)]\,dx = \sum_{i}\frac{1}{|b_i|}f\!\left(x^{(i)}_{0}\right), \tag{2.33}$$

where $b_i$ is the value of the derivative of $g(x)$ at the respective $i$-th zero $x^{(i)}_{0}$. To illustrate this procedure, consider an example.


Example 5 (Delta-function With Two Zeros) Consider

$$g(x) = x^{2} - x_0^{2}.$$

In this case the method outlined above yields

$$\int_{x_1}^{x_2} f(x)\,\delta\!\left(x^{2} - x_0^{2}\right)dx = \frac{1}{2x_0}\left[\int_{x_1}^{x_2} f(x)\,\delta(x - x_0)\,dx + \int_{x_1}^{x_2} f(x)\,\delta(x + x_0)\,dx\right], \tag{2.34}$$

where I assumed that both $x_0$ and $-x_0$ belong to the interval between $x_1$ and $x_2$.

I can also define a derivative of the delta-function using integration by parts and assuming that the integral of $df/dx$ is still equal to $f(x)$ even if $f(x) \equiv \delta(x)$. This is how it goes:

$$\begin{aligned}
\int_{x_1}^{x_2} f(x)\,\delta'(x - x_0)\,dx &= \Bigl. f(x)\,\delta(x - x_0)\Bigr|_{x_1}^{x_2} - \int_{x_1}^{x_2} \delta(x - x_0)f'(x)\,dx \\
&= \begin{cases} -\left.\dfrac{df}{dx}\right|_{x = x_0}, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases}
\end{aligned} \tag{2.35}$$
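As a quick check of Eq. 2.35, take $f(x) = x^{2}$, $x_0 = 1$, and the interval $[0, 2]$ (values chosen here only for illustration). The boundary term vanishes because the delta-function is zero at both endpoints, and the remaining integral is evaluated with Eq. 2.31:

```latex
\int_{0}^{2} x^{2}\,\delta'(x-1)\,dx
  = \Bigl. x^{2}\,\delta(x-1) \Bigr|_{0}^{2}
    - \int_{0}^{2} 2x\,\delta(x-1)\,dx
  = 0 - 2 = -2 .
```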

Similarly one can define higher derivatives of the delta-function.

We will also need an important representation of the delta-function as a Fourier transform:

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk. \tag{2.36}$$

To demonstrate that this representation of the delta-function actually makes sense, consider the direct and inverse Fourier transforms:

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \tilde{f}(k)e^{ikx}\,dk \tag{2.37}$$

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx. \tag{2.38}$$

Substituting Eq. 2.38 into Eq. 2.37, I get

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1)e^{-ikx_1}e^{ikx}\,dk\,dx_1 = \frac{1}{2\pi}\int_{-\infty}^{\infty} dx_1\,f(x_1)\int_{-\infty}^{\infty} e^{ik(x - x_1)}\,dk,$$

and the only way to make this into an identity for any function is to accept Eq. 2.36 for the integral over $k$.

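Finally, you will need a generalization of the delta-function to the case of several variables. For instance, the delta-function involving position vectors in Cartesian coordinates can be defined as

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) \equiv \delta(x_1 - x_2)\,\delta(y_1 - y_2)\,\delta(z_1 - z_2), \tag{2.39}$$

in which case its representation in the form of a Fourier transform becomes

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{(2\pi)^{3}}\int_{-\infty}^{\infty} e^{ik_x(x_1 - x_2)}e^{ik_y(y_1 - y_2)}e^{ik_z(z_1 - z_2)}\,dk_x\,dk_y\,dk_z = \frac{1}{(2\pi)^{3}}\int_{-\infty}^{\infty} e^{i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)}\,d^{3}k. \tag{2.40}$$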

Now back to the calculation of the norm, i.e., to Eq. 2.29. To complete this calculation, I will introduce a generalized, so-called delta-function normalization condition for states $|\mathbf{r}\rangle$ by requiring that

$$\langle\mathbf{r}_1|\mathbf{r}_2\rangle = \delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{2.41}$$

Substituting Eq. 2.41 into Eq. 2.29 and using the properties of the delta-function, I finally arrive at

$$\|\alpha\|^{2} = \int d^{3}r\,\psi^{*}(\mathbf{r})\psi(\mathbf{r}). \tag{2.42}$$

Now, in order to ensure correct normalization of the state $|\alpha\rangle$, you only need to require that the function $\psi(\mathbf{r})$ is chosen to be normalized such that

$$\int d^{3}r\,|\psi(\mathbf{r})|^{2} = 1. \tag{2.43}$$

Example 6 (Normalization of a Wave Function) Normalize the state $|\alpha\rangle$ presented by the following function:

$$\psi(x) = e^{ikx}e^{-ax^{2}/2}.$$

Solution

Using the definition of the norm, Eq. 2.42, I have

$$\|\alpha\|^{2} = \int_{-\infty}^{\infty} \psi^{*}(x)\psi(x)\,dx = \int_{-\infty}^{\infty} e^{-ax^{2}}\,dx = \sqrt{\frac{\pi}{a}},$$

where I used the substitution of variables $y = \sqrt{a}\,x$ and the well-known integral

$$\int_{-\infty}^{\infty} \exp\left(-y^{2}\right)dy = \sqrt{\pi}.$$

Thus, the normalized form of the state can be written as

$$|\alpha\rangle = \left(\frac{a}{\pi}\right)^{1/4}\int_{-\infty}^{\infty} dx\,e^{ikx}e^{-ax^{2}/2}\,|x\rangle.$$
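The result of Example 6 can be confirmed by direct numerical integration. This is a minimal sketch in plain Python; the parameter values `k = 1.5` and `a = 2.0`, the integration window, and the function names `psi` and `norm_squared` are assumptions made purely for illustration.

```python
import cmath
import math

def psi(x, k=1.5, a=2.0):
    """Unnormalized wave function exp(ikx) * exp(-a x^2 / 2)."""
    return cmath.exp(1j * k * x) * math.exp(-a * x * x / 2)

def norm_squared(a=2.0, half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of the integral of |psi|^2 over [-half_width, half_width]."""
    h = 2 * half_width / steps
    return sum(abs(psi(-half_width + (i + 0.5) * h, a=a)) ** 2
               for i in range(steps)) * h

a = 2.0
print(norm_squared(a), math.sqrt(math.pi / a))   # the two numbers should agree closely
print((a / math.pi) ** 0.25)                     # normalization factor (a/pi)^(1/4)
```

The phase factor $e^{ikx}$ has unit modulus, so the value of `k` does not affect the norm.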

The function $\psi(\mathbf{r})$ in these expressions is called the wave function and is often cited in quantum mechanics textbooks as the descriptor of a quantum state. You can see now that this is not quite the case: the definition of the wave function involves two different states, the actual state of the system $|\alpha\rangle$ and a state $|\mathbf{r}\rangle$ in which a particle would have a definite position. You can think of the wave function as a projection of $|\alpha\rangle$ on $|\mathbf{r}\rangle$. If the position $\mathbf{r}$ could only take discrete values, we would have interpreted $|\psi(\mathbf{r})|^{2}$ as a probability. In the continuous case, however, we can only ask about the probability that the measurement of position would produce a result within a certain (possibly infinitesimally small) volume around some central point $\mathbf{r}$. The answer to this question is well expressed in terms of the differential probability

$$dP(\mathbf{r}) \equiv d^{3}r\,|\psi(\mathbf{r})|^{2},$$

where $|\psi(\mathbf{r})|^{2}$ can be interpreted as the position probability density. The probability that the measured position vector belongs to some finite volume $V$ is given by the expression

$$P(V) = \iiint_{V} d^{3}r\,|\psi(\mathbf{r})|^{2}, \tag{2.44}$$

while the normalization condition 2.43 simply states the fact that the measurement of the particle’s position will produce some value within the entire volume available to the particle with probability equal to one.
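A one-dimensional analog of Eq. 2.44 is easy to evaluate for the normalized Gaussian state of Example 6, whose probability density is $\sqrt{a/\pi}\,e^{-ax^{2}}$. The sketch below, in plain Python, compares a numerical integral over an interval with the closed form written in terms of the error function; the value `a = 2.0`, the interval, and the function names are illustrative assumptions.

```python
import math

def probability(x1, x2, a=2.0, steps=20_000):
    """Midpoint-rule estimate of the integral of |psi(x)|^2 over [x1, x2]
    for the normalized Gaussian density sqrt(a/pi) * exp(-a x^2)."""
    h = (x2 - x1) / steps
    def density(x):
        return math.sqrt(a / math.pi) * math.exp(-a * x * x)
    return sum(density(x1 + (i + 0.5) * h) for i in range(steps)) * h

def probability_erf(x1, x2, a=2.0):
    """Closed form of the same integral via the error function."""
    s = math.sqrt(a)
    return 0.5 * (math.erf(s * x2) - math.erf(s * x1))

print(probability(-0.5, 0.5), probability_erf(-0.5, 0.5))
print(probability(-10.0, 10.0))   # practically the whole line
```

Extending the interval to cover essentially the whole line brings the probability to one, which is the content of the normalization condition.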


2.4 Problems

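Problem 1 Consider two states:

$$\begin{aligned}
|\psi_1\rangle &= |\phi_1\rangle + i|\phi_2\rangle - 2|\phi_3\rangle \\
|\psi_2\rangle &= -|\phi_1\rangle + 2|\phi_2\rangle - i|\phi_3\rangle,
\end{aligned}$$

where $|\phi_{1,2,3}\rangle$ are all normalized and orthogonal to each other.

1. Normalize states $|\psi_1\rangle$ and $|\psi_2\rangle$.
2. Find the adjoint counterparts of these states, $\langle\psi_1|$ and $\langle\psi_2|$.
3. Compute the inner products $\langle\psi_1|\psi_2\rangle$ and $\langle\psi_2|\psi_1\rangle$ and verify that $\langle\psi_1|\psi_2\rangle = \langle\psi_2|\psi_1\rangle^{*}$.
4. Find a linear combination of states $|\psi_1\rangle$ and $|\psi_2\rangle$ that would be orthogonal to $|\psi_1\rangle$.
5. Compute $\left(\langle\psi_1| + \langle\psi_2|\right)\left(|\psi_1\rangle + |\psi_2\rangle\right)$. Do it in two ways: (a) by computing the sums first and then taking the inner product and (b) by using the distributive property of the inner product to remove the parentheses and computing the inner products of the resulting individual terms.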

Problem 2 Determine if the following sets of vectors, defined by their components in some basis, are linearly dependent or independent:

1. $(2, 2, 0)$, $(1, 0, 1)$, $(0, i, -1)$
2. $(0, 0, 1)$, $(i, 0, 0)$, $(0, 0, -1)$
3. $(1, i, 2)$, $(1, i, -1)$, $(i, -i, 2i)$

Problem 3 Consider the set of functions

$$f_n(x) = A\sin\frac{\pi nx}{L},$$

where $L$ is a positive quantity.

1. Prove that these functions form an orthogonal system with the inner product defined as

$$\langle f_n|f_m\rangle \equiv \int_{0}^{L} f_n(x)f_m(x)\,dx.$$

2. Normalize these functions.
3. Find an expression for the coefficients $c_n$ in the expansion

$$\psi(x) = \sum_{n=0}^{\infty} c_n f_n(x),$$

where $f_n(x)$ is given by the expression from the first part of the problem with the amplitude $A$ replaced by the normalization coefficient found in Part 2.

Problem 4 Repeat Problem 3 with the set of functions

$$\varphi_n(x) = \exp\left(i\frac{2\pi nx}{L}\right),$$

and the inner product defined as

$$\langle\varphi_n|\varphi_m\rangle \equiv \int_{0}^{L} \varphi_n^{*}(x)\varphi_m(x)\,dx.$$

Problem 5 Consider functions $g_1(x) = x$, $g_2(x) = x^{2}$, and $g_3(x) = x^{3}$ defined on an interval $x \in [-1, 1]$ with the inner product defined as

$$\langle g_n|g_m\rangle = \int_{-1}^{1} dx\,g_n(x)g_m(x).$$

1. Which of these three functions are mutually orthogonal, and which are not?
2. Consider a linear combination of functions $g_1(x)$ and $g_3(x)$, $ag_1(x) + bg_3(x)$, and find coefficients $a$ and $b$ which would make this function orthogonal to $g_1(x)$.
3. Find a different linear combination of the same functions which would be orthogonal to $g_3(x)$.
4. Are these two new functions orthogonal to each other?

Problem 6 Consider a wave function of the form

$$\psi(x) = Ae^{ikx}xe^{-x^{2}/2}.$$

1. Normalize this function using the standard definition of the inner product for square-integrable functions.

2. Find the probability that a measurement of the x-coordinate of the particle will produce a value in the range $0 \le x \le \sqrt{2}$.

3. Find the probability that a measurement of the x-coordinate of the particle will produce a value such that $x > 2$. Use mathematical tables or any available computational tools to obtain the numerical values.

Problem 7 Consider a function of the form

$$f(x) = \begin{cases} \dfrac{1}{\Delta}, & |x| < \Delta/2 \\ 0, & |x| > \Delta/2. \end{cases}$$

Show that this function turns into a Dirac’s $\delta$-function in the limit $\Delta \to 0$.


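Problem 8 Compute the following integrals:

1. $\displaystyle\int_{0}^{1} \left(x^{3} + 5x\right)\delta(x - 2)\,dx$

2. $\displaystyle\int_{0}^{3} \left(\sin 2x + 2\tan 3x\right)\delta\!\left(x^{2} - 5x + 4\right)dx$

3. $\displaystyle\int_{-\infty}^{\infty} xe^{-x^{2}}\,\delta'(x + 5)\,dx$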

Problem 9 Evaluate the following expression:

$$\int_{-\infty}^{\infty} dx\,f(x)\int_{-\infty}^{\infty} dk\,k\,e^{ik(x - x_0)}.$$

Hint: Use the representation of the delta-function as a Fourier integral to figure out the integral with respect to $k$.

Chapter 3 Observables and Operators

3.1 Hamiltonian Formulation of Classical Mechanics

The version of classical mechanics based on forces and Newton’s laws resists any meaningful reformulation into a quantum theory because it depends critically on concepts (trajectory, acceleration, etc.) that do not correspond to any observable reality in the quantum world. More productive for finding links between the classical and quantum realms is an alternative formulation, where energy rather than force takes the central role. There are two essential elements in this formulation of classical mechanics. One is the idea of canonical coordinates in the so-called phase space (as opposed to regular three-dimensional configuration space), and the other is the concept of the Hamiltonian.

Points in the phase space represent classical states of the system, characterized, for instance, by its coordinates $x_i$ and the components of the momentum vector $p_i$. For a single particle moving along a straight line (one-dimensional motion), the phase space is two-dimensional; for fully three-dimensional motion, the phase space is six-dimensional; and for three-dimensional motion of $N$ particles, the dimension of the phase space is $6N$. Each point in the phase space represents the most complete information about a classical system: its coordinates and velocities. When particles move, their coordinates and momenta change, drawing a phase trajectory of the system in the phase space. For a single particle allowed to move only along a straight line, this trajectory is a curve in two-dimensional space. If the motion of the particle is conservative, i.e., its energy is a conserved quantity, each phase trajectory is an equienergetic line: each point on the trajectory corresponds to a state of the system with exactly the same energy (energy does not change while particles change their position and momentum). Using the phase space, we effectively put the space coordinates and momenta of the particles on an equal footing without imposing any a priori relationships between them (as opposed to elementary mechanics, where the momentum is defined via the time derivative of the coordinates). You shall see that

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_3

41

42 3 Observables and Operators

the relationship between coordinate and momentum, called canonically conjugate variables, arising within this framework is much closer to its quantum version than it would have been in the Newtonian approach.

The Hamiltonian is essentially the energy of a conservative system expressed in terms of coordinates and momentum, $H(\mathbf{p}, \mathbf{r})$, which in the case of a single particle takes the form

$$H(\mathbf{p}, \mathbf{r}) = \frac{p^2}{2m} + V(\mathbf{r}), \tag{3.1}$$

where $\mathbf{p}$ is the momentum vector$^1$ and $V(\mathbf{r})$ is the potential energy of the particle in the external field. The Hamiltonian occupies a special place in classical mechanics (as compared, for instance, to angular momentum, which can also be a conserved quantity under certain circumstances) because it determines the system’s dynamics via the Hamiltonian equations, which can be formulated as

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial r_i}, \tag{3.2}$$

$$\frac{dr_i}{dt} = \frac{\partial H}{\partial p_i}, \tag{3.3}$$

where $r_i$ and $p_i$ ($i = 1, 2, 3$) are the Cartesian components $x, y, z$ and $p_x, p_y, p_z$ of the position and momentum vectors, respectively. The Hamiltonian equations can be rewritten in another interesting form using the so-called Poisson brackets $\{f, g\}$, defined for two arbitrary functions of the canonical variables:

$$\{f, g\} = \sum_{i=1}^{N}\left(\frac{\partial f}{\partial r_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial r_i}\right). \tag{3.4}$$

Summation in Eq. 3.4 is over all relevant canonically conjugate pairs of coordinates. It is easy to see that the Poisson brackets for a coordinate and a momentum component are

$$\{r_i, p_j\} = \delta_{ij}. \tag{3.5}$$

This form of Poisson brackets is called canonical: any pair of variables whose Poisson brackets have this form constitutes a canonically conjugate pair and satisfies the Hamiltonian equations 3.2 and 3.3.

$^1$ $p^2$ is defined, as usual, as the squared magnitude of the vector in Cartesian coordinates: $p^2 = p_x^2 + p_y^2 + p_z^2$.

Applying the definition of the Poisson brackets, Eq. 3.4, to the pairs of functions $p_i$, $H$ and $r_i$, $H$, you can find (check it out!)

$$\{p_i, H\} = -\frac{\partial H}{\partial r_i}, \qquad \{r_i, H\} = \frac{\partial H}{\partial p_i},$$

so that the Hamiltonian equations 3.2 and 3.3 can be rewritten in an even more symmetric form:

$$\frac{dp_i}{dt} = \{p_i, H\}, \tag{3.6}$$

$$\frac{dr_i}{dt} = \{r_i, H\}. \tag{3.7}$$

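Finally, the time derivative of an arbitrary function of canonical coordinates can be expressed in terms of Poisson brackets involving the Hamiltonian. I illustrate this statement for a function of only one pair of coordinates, $f(x, p, t)$:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial p}\frac{dp}{dt} + \frac{\partial f}{\partial x}\frac{dx}{dt} = \frac{\partial f}{\partial t} - \frac{\partial f}{\partial p}\frac{\partial H}{\partial x} + \frac{\partial f}{\partial x}\frac{\partial H}{\partial p} = \frac{\partial f}{\partial t} + \{f, H\}. \tag{3.8}$$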
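The Poisson-bracket relations above can be checked numerically. The following is a minimal sketch, not taken from the text: it evaluates the bracket of Eq. 3.4 for a single canonical pair by central finite differences. The harmonic potential, the values of `M` and `K`, the step `STEP`, and all function names are assumptions made only for illustration.

```python
# Illustrative assumptions: a one-dimensional particle in a harmonic potential.
M = 1.0       # mass (assumed value)
K = 4.0       # spring constant of the model potential V(x) = K x^2 / 2 (assumed)
STEP = 1e-5   # finite-difference step (assumed)


def hamiltonian(x, p):
    """H(p, x) = p^2 / 2m + V(x), i.e. Eq. 3.1 with an assumed harmonic V."""
    return p * p / (2.0 * M) + 0.5 * K * x * x


def d_dx(f, x, p):
    """Central-difference estimate of the partial derivative with respect to x."""
    return (f(x + STEP, p) - f(x - STEP, p)) / (2.0 * STEP)


def d_dp(f, x, p):
    """Central-difference estimate of the partial derivative with respect to p."""
    return (f(x, p + STEP) - f(x, p - STEP)) / (2.0 * STEP)


def poisson_bracket(f, g, x, p):
    """{f, g} = df/dx * dg/dp - df/dp * dg/dx (Eq. 3.4 for one canonical pair)."""
    return d_dx(f, x, p) * d_dp(g, x, p) - d_dp(f, x, p) * d_dx(g, x, p)


def coordinate(x, p):
    return x


def momentum(x, p):
    return p


x0, p0 = 0.3, -1.2                                          # arbitrary phase-space point
canonical = poisson_bracket(coordinate, momentum, x0, p0)   # Eq. 3.5: should be close to 1
dp_dt = poisson_bracket(momentum, hamiltonian, x0, p0)      # Eq. 3.6: should be close to -K * x0
dx_dt = poisson_bracket(coordinate, hamiltonian, x0, p0)    # Eq. 3.7: should be close to p0 / M
```

Finite differences are used here only to keep the sketch free of external libraries; a symbolic algebra package would serve equally well.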

3.2 Operators in Quantum Mechanics

3.2.1 General Definitions

The main task of quantum theory is to be able to predict (or explain) results of experiments conducted with quantum systems. All such experiments involve taking a system in some initial state, subjecting it to external influences, which change its environment, and observing the reaction of the system to these changes. A theoretician in me would say that by doing all these manipulations and measurements, experimentalists change the quantum state of the system, but so far the formalism I have at my disposal does not have any theoretical representation of all these turning knobs and dials, lasers that go on and off, magnets, thermostats, and all other real material objects in the arsenal of an experimentalist. I need additional mathematical tools, which would allow me to describe theoretically all these changes inflicted upon an unsuspecting system by the men in lab coats. Since quantum states are represented in the theory by vectors of a linear vector space, what I need are objects that can change these vectors. Such objects are known to mathematicians, who call them operators. The role of operators in quantum theory is twofold. On one hand, they are used to describe transformations of state vectors, and on the other hand, they provide the theoretical means to predict the outcomes of measurements of observables.


From the mathematical standpoint, an operator is a rule prescribing how to change one abstract vector of a linear vector space, say, $|\alpha\rangle$, into another abstract vector, say, $|\beta\rangle$, of the same or a different vector space. Symbolically this can be represented as

$$|\beta\rangle = \hat{T}|\alpha\rangle, \tag{3.9}$$

where the “hat” above a capital letter (in this case $T$) signifies that $\hat{T}$ represents such a rule, or an operator, “acting” on $|\alpha\rangle$ and converting it into $|\beta\rangle$. Note that the symbol of the operator appears in Eq. 3.9 next to the vertical line marking the “tail” of the ket $|\alpha\rangle$.

A special role in quantum mechanics and other applications is played by linear operators, the class of rules satisfying the following condition:

$$\hat{T}\left(a_1|\alpha_1\rangle + a_2|\alpha_2\rangle\right) = a_1\hat{T}|\alpha_1\rangle + a_2\hat{T}|\alpha_2\rangle. \tag{3.10}$$

Here are a few examples of linear operators:

1. Differentiation operator $d/dx$, converting a function $f(x)$ into its derivative $g(x) = (d/dx) f \equiv df/dx$ (note how the operator symbol appears on the left of the function)

2. Gradient operator $\vec{\nabla} = \mathbf{e}_x\,\partial/\partial x + \mathbf{e}_y\,\partial/\partial y + \mathbf{e}_z\,\partial/\partial z$, where $\mathbf{e}_{x,y,z}$ are unit vectors in the directions of the respective coordinate axes, converting a scalar function of three spatial variables into a vector:

$$\vec{\nabla} f(x, y, z) = \mathbf{e}_x\frac{\partial f}{\partial x} + \mathbf{e}_y\frac{\partial f}{\partial y} + \mathbf{e}_z\frac{\partial f}{\partial z}$$

3. Integration operator $\hat{K}$, which is defined by its kernel $K(x_1, x_2)$ and converts one function to another as

$$|g\rangle = \hat{K}|f\rangle \iff g(x_1) = \int_{-\infty}^{\infty} K(x_1, x_2)\, f(x_2)\, dx_2$$

4. Rotation operator $\hat{R}$, which changes the orientation of a vector without changing its length

Linearity of the first three operators is evident from linearity of differentiation and integration, and the proof of linearity of rotations is a simple exercise in geometry and is left to the readers to perform.

Equation 3.9 defines an operator by its action on a ket vector. It is also possible to define an operator acting on bra vectors. One can, for instance, perform formal Hermitian conjugation of Eq. 3.9 and introduce the Hermitian conjugate operator $\hat{T}^\dagger$:

$$\langle\beta| = \langle\alpha|\hat{T}^\dagger. \tag{3.11}$$

Notice that now the operator $\hat{T}^\dagger$ stands to the right of the respective bra vector but still next to its tail, “acting” to the left. Thus, Hermitian conjugation in this case also involves a change in the order in which the participating objects are written, as well as in the “direction” of “action” of the operators, which now runs from right to left.

In order to help you develop intuition regarding the transition between Eqs. 3.9 and 3.11, consider a linear space of column vectors, i.e., $N \times 1$ matrices. For operators you can take $N \times N$ matrices and define their action on a vector as regular matrix multiplication. For this definition to make sense from the point of view of matrix multiplication rules, the matrix must be placed to the left of the column vector. The result of this operation is another column vector:

$$\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1N} \\ t_{21} & t_{22} & \cdots & t_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t_{N1} & t_{N2} & \cdots & t_{NN} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}. \tag{3.12}$$

The Hermitian conjugate of a column vector is a row vector (a $1 \times N$ matrix) with complex-conjugated elements. Equation 3.12 contains two column vectors, and its Hermitian conjugated version must describe the relation between two rows, $\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* \end{bmatrix}$ and $\begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* \end{bmatrix}$. However, in order to be able to multiply a square matrix and a row vector, I must place the former to the right of the latter:

$$\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* \end{bmatrix} \begin{bmatrix} t^\dagger_{11} & t^\dagger_{12} & \cdots & t^\dagger_{1N} \\ t^\dagger_{21} & t^\dagger_{22} & \cdots & t^\dagger_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t^\dagger_{N1} & t^\dagger_{N2} & \cdots & t^\dagger_{NN} \end{bmatrix} = \begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* \end{bmatrix}, \tag{3.13}$$

where $t^\dagger_{ij}$ represents the elements of the Hermitian conjugate operator matrix $\hat{T}^\dagger$. If I want (and I certainly do) the relation between the elements $a_i^*$ and $b_i^*$ expressed by Eq. 3.13 to reproduce the complex-conjugated relations given by Eq. 3.12, I must require that the rows of the matrix in Eq. 3.12 coincide with the complex-conjugated columns of the matrix in Eq. 3.13: $t^\dagger_{ij} = t^*_{ji}$. This gives me an operational (not just formal) rule for performing the Hermitian conjugation of a matrix operator: it consists of regular matrix transposition and complex conjugation of all matrix elements. This example serves two important purposes: first, it demonstrates why the reversal of the order in which vectors and operators appear after Hermitian conjugation makes sense, and, second, it yields a rule for Hermitian conjugation of a matrix.
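The transposition-plus-conjugation rule can be tried out on a small matrix. The sketch below is only an illustration under assumed inputs: the helper names `dagger` and `matmul` and the sample matrix `T` and column `a` are hypothetical and do not come from the text. It conjugates the column relation of Eq. 3.12 and compares the result with the row relation of Eq. 3.13.

```python
def dagger(matrix):
    """Hermitian conjugate: transpose, then complex-conjugate every element."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Arbitrary sample data (assumed for illustration only).
T = [[1 + 2j, 3j],
     [4, 5 - 1j]]
a = [[1 - 1j],          # a column vector stored as a 2 x 1 matrix
     [2 + 3j]]

b = matmul(T, a)                        # Eq. 3.12: b = T a
row_from_b = dagger(b)                  # the row (b1*, b2*)
row_from_a = matmul(dagger(a), dagger(T))   # Eq. 3.13: (a1*, a2*) T-dagger
```

The two rows `row_from_b` and `row_from_a` agree element by element, which is the matrix version of passing from Eq. 3.9 to Eq. 3.11.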

In a general case, Eq. 3.11 does not give us any clue on how to actually generate Hermitian conjugate operators. In order to derive such a rule, I need to relate both (initial and Hermitian conjugate) operators to a quantity that I know how to transform and that does not depend on any concrete realization of the vector space or an operator. The only quantity of this kind that I know of is an inner product, and in order to get to it, I will multiply Eq. 3.9 by a bra vector $\langle\beta|$ from the left. This will leave me with the expression $\langle\beta|\hat{T}|\alpha\rangle$, which can be understood as a product of a bra vector $\langle\beta|$ and a ket vector $\hat{T}|\alpha\rangle$. Complex conjugating this expression and applying Eq. 2.19, I get

$$\langle\beta|\hat{T}|\alpha\rangle^* = \langle\alpha|\hat{T}^\dagger|\beta\rangle, \tag{3.14}$$

where I also used Eq. 3.11 to convert the ket $\hat{T}|\alpha\rangle$ into the corresponding bra $\langle\alpha|\hat{T}^\dagger$. Equation 3.14 can be used to find a Hermitian conjugate of any particular operator, as illustrated by the following examples.

Example 7 (Hermitian Conjugation) Consider the differentiation operator $\hat{D}$ acting on differentiable square-integrable functions as

$$\hat{D}|f\rangle \equiv \frac{df}{dx}.$$

Using the definition of the inner product given by Eq. 2.21, you can present the expression in Eq. 3.14 as

$$\langle g|\hat{D}|f\rangle \equiv \int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}.$$

Complex conjugation followed by integration by parts converts this expression into the following form:

$$\left(\langle g|\hat{D}|f\rangle\right)^* = \left(\int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}\right)^* = g(x) f^*(x)\Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx} = -\int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx},$$

where I took into account that any square-integrable function must vanish at both positive and negative infinity. Presenting this result in the form of the right side of Eq. 3.14, $\langle f|\hat{D}^\dagger|g\rangle$, you can identify $\hat{D}^\dagger$ as $\hat{D}^\dagger = -d/dx$.

If an operator and its Hermitian conjugate coincide,

$$\langle\beta|\hat{T}|\alpha\rangle^* = \langle\alpha|\hat{T}|\beta\rangle \tag{3.15}$$

or $\hat{T} = \hat{T}^\dagger$, the respective operator is called a Hermitian or self-adjoint operator. Hermitian operators have a number of important properties, which will be discussed in more detail in Sect. 3.3. Here I shall note just one important property of Hermitian operators, which trivially follows from Eq. 3.15: a quantity defined as $\langle\alpha|\hat{T}|\alpha\rangle$ is a real-valued number for any choice of state $|\alpha\rangle$. Expressions of this type are called expectation values of the operator in a given state. The origin of this name will become clear in Sect. 3.3. A few examples of Hermitian operators follow below.

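Example 8 (Hermitian Operators) Let me prove that the operator $i\hat{D}$, where $\hat{D}$ is the differentiation operator introduced in the previous example, is Hermitian. To this end I just need to repeat the computations from Example 7:

$$\left(\langle g|\, i\hat{D}\, |f\rangle\right)^* = \left(i\int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}\right)^* = -i\, g(x) f^*(x)\Big|_{-\infty}^{\infty} + i\int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx} \equiv \langle f|\, i\hat{D}\, |g\rangle,$$

where the boundary term again vanishes for square-integrable functions.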

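Example 9 (Hermitian Operators) As a second example of a Hermitian operator, I consider a $3 \times 3$ matrix $M$ acting on vectors in a three-dimensional vector space:

$$M = \begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}.$$

I will demonstrate that this matrix is Hermitian by recalling that Hermitian conjugation of matrices consists of transposition and complex conjugation. Carrying out these operations consecutively, you can convince yourselves that they yield the same matrix $M$:

$$\begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & -i & 2 \\ i & 1 & -4i \\ 2 & 4i & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix}.$$

You can also compute the expression $a^\dagger \cdot M \cdot a$, where $a$ is an arbitrary column vector and $a^\dagger$ is its Hermitian conjugate:

$$\begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix} \begin{bmatrix} 1 & i & 2 \\ -i & 1 & 4i \\ 2 & -4i & 0 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix} \begin{bmatrix} a_1 + i a_2 + 2 a_3 \\ -i a_1 + a_2 + 4i a_3 \\ 2 a_1 - 4i a_2 \end{bmatrix}$$

$$= a_1^* a_1 + i a_1^* a_2 + 2 a_1^* a_3 - i a_2^* a_1 + a_2^* a_2 + 4i a_2^* a_3 + 2 a_1 a_3^* - 4i a_2 a_3^*$$

$$= |a_1|^2 + |a_2|^2 + 2\left(a_1^* a_3 + a_1 a_3^*\right) + i\left(a_1^* a_2 - a_2^* a_1 + 4 a_2^* a_3 - 4 a_2 a_3^*\right).$$

Note that no $|a_3|^2$ term appears because the element $M_{33}$ is zero.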

It is obvious that the final expression is real-valued as promised.
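For readers who prefer to verify Example 9 by machine, here is a short sketch in plain Python. The helper names and the particular test vector `a` are assumptions chosen for illustration; only the matrix `M` is taken from the example.

```python
M = [[1, 1j, 2],
     [-1j, 1, 4j],
     [2, -4j, 0]]


def dagger(matrix):
    """Transpose and complex-conjugate a nested-list matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


def expectation(matrix, vector):
    """Compute a-dagger M a as the double sum over conj(a_i) * M_ij * a_j."""
    n = len(vector)
    return sum(vector[i].conjugate() * matrix[i][j] * vector[j]
               for i in range(n) for j in range(n))


a = [1 + 1j, 2 - 1j, -3j]          # arbitrary complex vector (assumed)
is_hermitian = dagger(M) == M      # transposition + conjugation returns M itself
value = expectation(M, a)          # imaginary part vanishes for a Hermitian M
```

Any other choice of `a` should also give a result with zero imaginary part, since the argument in the example does not depend on the components of the vector.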


3.2.2 Commutators, Functions of Operators, and Operator Identities

In addition to Hermitian conjugation, you will need to perform other, less exotic, operations on operators, such as multiplication. The product of two operators $\hat{T}_1$ and $\hat{T}_2$ is defined as the consecutive action of the operators. If you consider action on a ket vector, the first operator to do the work is the one on the right:

$$\left(\hat{T}_2\hat{T}_1\right)|\alpha\rangle \equiv \hat{T}_2\left(\hat{T}_1|\alpha\rangle\right).$$

In the case of operators acting on a bra vector, the order is opposite: the first to act is the leftmost operator:

$$\langle\alpha|\left(\hat{T}_2\hat{T}_1\right) \equiv \left(\langle\alpha|\hat{T}_2\right)\hat{T}_1.$$

The most important property of operator multiplication is actually the absence of a property: multiplication of operators is not, in general, commutative$^2$:

$$\hat{T}_2\hat{T}_1 \neq \hat{T}_1\hat{T}_2.$$

The non-commutative nature of operator multiplication is of extreme importance in quantum mechanics, and as you will see, it is the main mathematical feature responsible, for instance, for the uncertainty relation. For the same reason, sets of operators that do commute with each other also play an important role in the quantum formalism.

The non-commutativity of operator multiplication is expressed quantitatively via the notion of a commutator. The commutator of two operators, $\left[\hat{T}_1, \hat{T}_2\right]$, is defined as

$$\left[\hat{T}_1, \hat{T}_2\right] = \hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1. \tag{3.16}$$

The knowledge of the commutator or, as it is sometimes called, the commutation relation between two operators is essential and, often, the most important information about operators that you can have. You will see throughout the course how the commutation relations of different operators are used in a variety of applications and calculations.

Commutators have a few important properties, the most frequently used of which are the following:

$^2$ We are all used to dealing with commutative multiplication of numbers: the result does not depend on the order in which the multiplication is performed. The lack of commutativity of multiplication was one of the features of the Heisenberg theory that especially freaked out Schrödinger.


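$$\left[\hat{T}_1, \hat{T}_2\right] = -\left[\hat{T}_2, \hat{T}_1\right] \tag{3.17}$$

$$\left[\hat{T}_1 + \hat{T}_2, \hat{T}_3\right] = \left[\hat{T}_1, \hat{T}_3\right] + \left[\hat{T}_2, \hat{T}_3\right] \tag{3.18}$$

$$\left[c_1\hat{T}_1, c_2\hat{T}_2\right] = c_1 c_2\left[\hat{T}_1, \hat{T}_2\right]. \tag{3.19}$$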

The proof of all these identities is quite obvious, and I shall leave it for you as an exercise.
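While the proofs are left as an exercise, the identities 3.17 to 3.19 are easy to spot-check on matrices. The sketch below uses small integer matrices so that all arithmetic is exact; the matrices `T1`, `T2`, `T3`, the constants `c1`, `c2`, and the helper names are arbitrary assumptions, not objects from the text.

```python
def matmul(a, b):
    """Matrix product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every element of a matrix by the number c."""
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = a b - b a, Eq. 3.16."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


# Arbitrary integer matrices (assumed for illustration).
T1 = [[1, 2], [3, 4]]
T2 = [[0, 1], [-1, 2]]
T3 = [[2, 0], [1, -1]]
c1, c2 = 3, -2

antisymmetry = commutator(T1, T2) == scale(-1, commutator(T2, T1))            # Eq. 3.17
additivity = (commutator(add(T1, T2), T3)
              == add(commutator(T1, T3), commutator(T2, T3)))                 # Eq. 3.18
scaling = (commutator(scale(c1, T1), scale(c2, T2))
           == scale(c1 * c2, commutator(T1, T2)))                             # Eq. 3.19
```

A numerical check on one set of matrices is, of course, not a proof; it only guards against sign and ordering mistakes.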

Having defined a product of two operators, I can introduce a power function for the operators: $\hat{T}^n$ simply means applying the same operator $n$ times. The power function is important because it allows defining other, more complex, functions of the operators. In general, a function $f(x)$ that has infinitely many derivatives at $x = 0$ can be expanded in an infinite Taylor series. Using this series one can define the operator function $f(\hat{T})$ by simply substituting the operator for $x$ in the series:

$$f\left(\hat{T}\right) = \sum_{n=0}^{\infty}\frac{1}{n!}\left.\frac{d^n f}{dx^n}\right|_{x=0}\hat{T}^n.$$

However, a number of important functions, which you are used to dealing with routinely, cannot be defined this way and, therefore, do not make sense for operators. Among them are $\sqrt{\hat{T}}$, $\ln\left(\hat{T}\right)$, and other similar functions with singularities at zero.

An important exception is the function $\hat{T}^{-1}$, called the inverse operator, which is defined by the equation

$$\hat{T}\hat{T}^{-1} = \hat{T}^{-1}\hat{T} = \hat{I}, \tag{3.20}$$

where $\hat{I}$ is a unity operator, i.e., an operator which does not change a vector it acts upon. The meaning of the inverse operator can be illustrated by the following expressions:

$$\hat{T}|\alpha\rangle = |\beta\rangle,$$

$$\hat{T}^{-1}|\beta\rangle = |\alpha\rangle,$$

where the second line is obtained from the first one by multiplying both of its sides by $\hat{T}^{-1}$ from the left. Finding inverse operators is usually a difficult task and often amounts to solving an entire problem. If an operator has the form of a matrix, its inverse can be found according to the standard rules for inverting matrices.

Finding inverse operators is significantly simplified for a special class of operators called unitary operators. These operators, defined by the condition

$$\hat{U}^\dagger = \hat{U}^{-1},$$

play an extremely important role in quantum theory (we value them, of course, not just because their inverse is easy to find). The main property of unitary operators is that they do not change the norm of the vectors or their inner products. Indeed, consider vectors $|\alpha\rangle$ and $|\beta\rangle$, and define new vectors $|\tilde{\alpha}\rangle = \hat{U}|\alpha\rangle$ and $|\tilde{\beta}\rangle = \hat{U}|\beta\rangle$, where $\hat{U}$ is a unitary operator. Direct computation of $\langle\tilde{\alpha}|\tilde{\beta}\rangle$ proves this statement:

$$\langle\tilde{\alpha}| = \langle\alpha|\hat{U}^\dagger \;\Rightarrow\; \langle\tilde{\alpha}|\tilde{\beta}\rangle = \langle\alpha|\hat{U}^\dagger\hat{U}|\beta\rangle = \langle\alpha|\hat{U}^{-1}\hat{U}|\beta\rangle = \langle\alpha|\beta\rangle.$$
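The invariance of the inner product can also be observed numerically. The following sketch assumes a particular $2 \times 2$ unitary matrix `U` and two arbitrary complex vectors; these, together with the helper names, are illustrative choices and not part of the text.

```python
import math


def dagger(matrix):
    """Transpose and complex-conjugate a nested-list matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


def matmul(a, b):
    """Matrix product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(matrix, vector):
    """Act with a matrix on a vector stored as a flat list."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def inner(a, b):
    """Inner product <a|b> = sum over conj(a_i) * b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


S = 1.0 / math.sqrt(2.0)
U = [[S, 1j * S],          # an assumed unitary matrix, (1/sqrt 2) [[1, i], [i, 1]]
     [1j * S, S]]

alpha = [1 + 2j, -1j]      # arbitrary vectors (assumed)
beta = [0.5, 3 - 1j]

u_dagger_u = matmul(dagger(U), U)                     # should be the unit matrix
before = inner(alpha, beta)                           # <alpha|beta>
after = inner(apply(U, alpha), apply(U, beta))        # <alpha~|beta~>
```

Up to rounding, `before` and `after` coincide, and setting `beta = alpha` shows that the norm is preserved as well.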

Unitary operators are a generalization of the rotation operator acting on regular three-dimensional vectors: rotation of two vectors by the same angle changes neither their lengths nor the angle between them. As a result, the dot product of these vectors also does not change. Here is an example of a unitary operator based on the two-dimensional rotation matrix.

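Example 10 (Unitary Operators) Consider the well-known matrix used to relate the coordinates of a two-dimensional vector rotated by an angle $\theta$ from its initial position:

$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

Its Hermitian conjugate is

$$R^\dagger = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$

Simple computation shows that the product $R^\dagger R$ is a unity matrix:

$$R^\dagger R = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + \sin^2\theta & -\cos\theta\sin\theta + \cos\theta\sin\theta \\ -\cos\theta\sin\theta + \cos\theta\sin\theta & \cos^2\theta + \sin^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$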

This proves, of course, that $R^\dagger = R^{-1}$.

An important example of an operator function is the exponential function, defined as

$$\exp\left(\hat{T}\right) \equiv \sum_{n=0}^{\infty}\frac{1}{n!}\hat{T}^n. \tag{3.21}$$


Some of the familiar properties of this function remain valid even when its argument is an operator. For instance, the derivative of the expression $f(\xi) = \exp\left(\xi\hat{T}\right)$ with respect to the parameter $\xi$ is calculated as though $\hat{T}$ were a regular number:

$$\frac{df}{d\xi} = \hat{T}\exp\left(\xi\hat{T}\right).$$

You should be warned, however, that a very convenient property of exponential functions,

$$\exp(x + y) = \exp(x)\exp(y), \tag{3.22}$$

does not hold for operator arguments. One way to understand the reason for this unfortunate circumstance is to notice that if two operators $\hat{T}_1$ and $\hat{T}_2$ in the argument of the exponential function $\exp\left(\hat{T}_1 + \hat{T}_2\right)$ do not commute, the expressions $\exp\left(\hat{T}_1\right)\exp\left(\hat{T}_2\right)$ and $\exp\left(\hat{T}_2\right)\exp\left(\hat{T}_1\right)$ are not equivalent, so they cannot both be equal to the exponential of the sum of these operators. Generalization of Eq. 3.22 to the case of operator arguments is, in general, very complicated and will not be considered here. There is, however, one case in which such a generalization has a relatively simple form and can be derived without too much effort, although some work is still required, of course. This simplification takes place when the commutator of the operators $\hat{T}_1$ and $\hat{T}_2$ commutes with both of them. In most cases, this means that the commutator is a regular number, but it does not have to be.

So, suppose that the commutator of two operators $\hat{T}_1$ and $\hat{T}_2$ is $\left[\hat{T}_1, \hat{T}_2\right] = \hat{C}$, where $\hat{C}$ is such that $\left[\hat{T}_1, \hat{C}\right] = \left[\hat{T}_2, \hat{C}\right] = 0$. This assumption appears to be quite restrictive, but in reality, it is fulfilled for a great many pairs of operators that are important for quantum mechanics. In order to derive the promised generalization of Eq. 3.22, I have to, first, prove two intermediate identities, which, however, are useful in their own right. Let me begin by computing the following expression:

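$$\left[\hat{T}_1, e^{\xi\hat{T}_2}\right] = \sum_{n=0}^{\infty}\frac{1}{n!}\left[\hat{T}_1, \xi^n\hat{T}_2^n\right] = \sum_{n=0}^{\infty}\frac{\xi^n}{n!}\left[\hat{T}_1, \hat{T}_2^n\right]. \tag{3.23}$$

To proceed I need to prove the following identity for the commutators:

$$\left[\hat{T}_1, \hat{T}_2^n\right] = n\hat{C}\hat{T}_2^{n-1}. \tag{3.24}$$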

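The easiest way to do it is to use the method of mathematical induction. For those who have forgotten how this method works, the first step is to prove the statement for the first nontrivial value of the index ($n = 2$ in this case). After that you assume that the statement is correct for $n = k$ and, using this assumption, prove it for $n = k + 1$. Thus, the first step is to consider $n = 2$:

$$\left[\hat{T}_1, \hat{T}_2^2\right] = \hat{T}_1\hat{T}_2^2 - \hat{T}_2^2\hat{T}_1 = \hat{T}_1\hat{T}_2\hat{T}_2 - \hat{T}_2\hat{T}_1\hat{T}_2 + \hat{T}_2\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_2\hat{T}_1 = \left(\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1\right)\hat{T}_2 + \hat{T}_2\left(\hat{T}_1\hat{T}_2 - \hat{T}_2\hat{T}_1\right) = 2\hat{C}\hat{T}_2.$$

(Note that it works because $\hat{C}$ commutes with $\hat{T}_2$.) Next, the assumption for $n = k$:

$$\left[\hat{T}_1, \hat{T}_2^k\right] = k\hat{C}\hat{T}_2^{k-1}.$$

The final step is the proof for $n = k + 1$:

$$\left[\hat{T}_1, \hat{T}_2^{k+1}\right] = \hat{T}_1\hat{T}_2^{k+1} - \hat{T}_2^{k+1}\hat{T}_1 = \hat{T}_1\hat{T}_2^{k+1} - \hat{T}_2\hat{T}_1\hat{T}_2^k + \hat{T}_2\hat{T}_1\hat{T}_2^k - \hat{T}_2^{k+1}\hat{T}_1 = \left[\hat{T}_1, \hat{T}_2\right]\hat{T}_2^k + \hat{T}_2\left[\hat{T}_1, \hat{T}_2^k\right] = \hat{C}\hat{T}_2^k + k\hat{C}\hat{T}_2^k = (k + 1)\hat{C}\hat{T}_2^k.$$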

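Using this identity I can transform Eq. 3.23 into

$$\left[\hat{T}_1, e^{\xi\hat{T}_2}\right] = \hat{C}\sum_{n=0}^{\infty}\frac{\xi^n n}{n!}\hat{T}_2^{n-1} = \hat{C}\xi\sum_{n=1}^{\infty}\frac{\xi^{n-1}}{(n-1)!}\hat{T}_2^{n-1} = \xi\hat{C}e^{\xi\hat{T}_2}. \tag{3.25}$$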

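This result can be used to derive another important identity. Multiply Eq. 3.25 by $e^{-\xi\hat{T}_2}$ from the left:

$$e^{-\xi\hat{T}_2}\left[\hat{T}_1, e^{\xi\hat{T}_2}\right] = e^{-\xi\hat{T}_2}\,\xi\hat{C}\,e^{\xi\hat{T}_2}.$$

The right-hand side of this expression simplifies to $\hat{C}\xi$: since $\hat{C}$ commutes with $\hat{T}_2$, it can be moved out, and $e^{-\xi\hat{T}_2}e^{\xi\hat{T}_2} = e^{-\xi\hat{T}_2 + \xi\hat{T}_2} = \hat{I}$, because Eq. 3.22 is applicable for any commuting operators and any operator commutes with itself. Now you can expand the commutator on the left of the expression above to get

$$e^{-\xi\hat{T}_2}\hat{T}_1 e^{\xi\hat{T}_2} - \hat{T}_1 = \hat{C}\xi$$

or

$$e^{-\xi\hat{T}_2}\hat{T}_1 e^{\xi\hat{T}_2} = \hat{T}_1 + \hat{C}\xi. \tag{3.26}$$

Now I am ready to approach my main target and to prove that

$$e^{\hat{T}_1 + \hat{T}_2} = e^{\hat{T}_1}e^{\hat{T}_2}e^{-\frac{1}{2}\left[\hat{T}_1, \hat{T}_2\right]}. \tag{3.27}$$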


The proof of this identity is more involved than the two previous derivations. Direct proof (for instance, by using series expansions of the exponential functions on both sides of Eq. 3.27) results in expressions too cumbersome to allow for fruitful analysis. Therefore, I am going to use an indirect approach, which was invented by Harvard Professor Roy Glauber, winner of the 2005 Nobel Prize for his contribution to quantum optics. Glauber considered the function $f(x) = e^{x\hat{T}_1}e^{x\hat{T}_2}$, for which he derived a differential equation by computing its derivative:

$$\frac{df}{dx} = \hat{T}_1 e^{x\hat{T}_1}e^{x\hat{T}_2} + e^{x\hat{T}_1}\hat{T}_2 e^{x\hat{T}_2}.$$

Note how the operators $\hat{T}_1$ and $\hat{T}_2$ are placed in this expression: $\hat{T}_1$ appears in front of the exponent containing $\hat{T}_2$ because it originates from the exponential function of $\hat{T}_1$ positioned to the left of $e^{x\hat{T}_2}$. At the same time, $\hat{T}_2$ appears behind $e^{x\hat{T}_1}$, following the respective position of $e^{x\hat{T}_2}$. The relative positions of $e^{x\hat{T}_i}$ and the respective $\hat{T}_i$ are not important because these operators commute (any operator commutes with any function of the same operator). Now the derivative can be rewritten in the following way:

$$\frac{df}{dx} = e^{x\hat{T}_1}e^{x\hat{T}_2}e^{-x\hat{T}_2}e^{-x\hat{T}_1}\left(\hat{T}_1 e^{x\hat{T}_1}e^{x\hat{T}_2} + e^{x\hat{T}_1}\hat{T}_2 e^{x\hat{T}_2}\right).$$

It is not too difficult to see that the expression in front of the brackets is equal to unity, so writing it there does not change anything. Continuing,

$$\frac{df}{dx} = f(x)\left(e^{-x\hat{T}_2}\hat{T}_1 e^{x\hat{T}_2} + \hat{T}_2\right) = f(x)\left(\hat{T}_1 + x\left[\hat{T}_1, \hat{T}_2\right] + \hat{T}_2\right),$$

where the identity given by Eq. 3.26 is used and $\hat{C}$ is replaced with $\left[\hat{T}_1, \hat{T}_2\right]$. This differential equation can now be solved for the function $f(x)$:

$$\int\frac{df}{f} = \int dx\left(\hat{T}_1 + \hat{T}_2 + x\left[\hat{T}_1, \hat{T}_2\right]\right) \;\Rightarrow\; \ln\frac{f}{f_0} = x\left(\hat{T}_1 + \hat{T}_2\right) + \frac{1}{2}x^2\left[\hat{T}_1, \hat{T}_2\right],$$

where the integration constant $f_0$ is chosen to satisfy the obvious initial condition $f(0) = 1$. With this in mind, the function $f$ can be written as

$$f = e^{x\left(\hat{T}_1 + \hat{T}_2\right) + \frac{1}{2}x^2\left[\hat{T}_1, \hat{T}_2\right]}.$$

Setting $x = 1$ in this expression and multiplying it by $e^{-\frac{1}{2}\left[\hat{T}_1, \hat{T}_2\right]}$, Eq. 3.27 is finally obtained, completing the proof.
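Equation 3.27 can be tested on a simple matrix example in which the commutator commutes with both operators. The sketch below is an illustration under assumptions of my own choosing: `T1` and `T2` are strictly upper-triangular $3 \times 3$ matrices (so their commutator has a single corner element and commutes with both), and the exponential is computed from the truncated Taylor series of Eq. 3.21, which terminates exactly for such matrices. All helper names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = a b - b a."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))


def expm_series(a, terms=20):
    """exp(a) from the Taylor series of Eq. 3.21, truncated after `terms` terms."""
    result = identity(len(a))
    power = identity(len(a))
    for n in range(1, terms):
        power = scale(1.0 / n, matmul(power, a))   # power now holds a^n / n!
        result = add(result, power)
    return result


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Assumed example: T1 and T2 have one nonzero element each above the diagonal.
T1 = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
T2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]]
C = commutator(T1, T2)                                  # commutes with T1 and T2

lhs = expm_series(add(T1, T2))                          # exp(T1 + T2)
naive = matmul(expm_series(T1), expm_series(T2))        # exp(T1) exp(T2), Eq. 3.22 misapplied
rhs = matmul(naive, expm_series(scale(-0.5, C)))        # right-hand side of Eq. 3.27
```

Here `lhs` and `rhs` agree to rounding accuracy, while `naive` differs from `lhs` in the corner element, which illustrates why Eq. 3.22 cannot be used for non-commuting operators.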


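I want to finish this section with two important technical statements about Hermitian operators. The first one is concerned with Hermitian conjugation of a product of two Hermitian operators. It can be shown that

$$\left(\hat{T}_1\hat{T}_2\right)^\dagger = \hat{T}_2\hat{T}_1. \tag{3.28}$$

This statement can be proven as follows. By definition

$$\langle\alpha|\left(\hat{T}_1\hat{T}_2\right)^\dagger|\beta\rangle = \left(\langle\beta|\hat{T}_1\hat{T}_2|\alpha\rangle\right)^*.$$

Introducing $\langle\beta|\hat{T}_1 = \langle\tilde{\beta}|$ and $\hat{T}_2|\alpha\rangle = |\tilde{\alpha}\rangle$ and using Eq. 2.19, you can write the right-hand side of this expression as

$$\left(\langle\tilde{\beta}|\tilde{\alpha}\rangle\right)^* = \langle\tilde{\alpha}|\tilde{\beta}\rangle.$$

The rules for Hermitian conjugation yield $|\tilde{\beta}\rangle = \hat{T}_1^\dagger|\beta\rangle$ and $\langle\tilde{\alpha}| = \langle\alpha|\hat{T}_2^\dagger$, which allows me to proceed as follows:

$$\left(\langle\beta|\hat{T}_1\hat{T}_2|\alpha\rangle\right)^* = \left(\langle\tilde{\beta}|\tilde{\alpha}\rangle\right)^* = \langle\tilde{\alpha}|\tilde{\beta}\rangle = \langle\alpha|\hat{T}_2^\dagger\hat{T}_1^\dagger|\beta\rangle.$$

By the way, you may have noticed that I have actually proved a more general statement. Indeed, the last equation means that

$$\left(\hat{T}_1\hat{T}_2\right)^\dagger = \hat{T}_2^\dagger\hat{T}_1^\dagger,$$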

which is valid for any linear, not necessarily Hermitian, operators. Equation 3.28 follows from this result if $\hat{T}_1$ and $\hat{T}_2$ are Hermitian. An immediate corollary of this result is the following:

$$\left[\hat{T}_1, \hat{T}_2\right]^\dagger = \left[\hat{T}_2, \hat{T}_1\right] = -\left[\hat{T}_1, \hat{T}_2\right]. \tag{3.29}$$

Operators which change sign upon Hermitian conjugation are called anti-Hermitian, so the commutator of two Hermitian operators is anti-Hermitian. It is now easy to demonstrate that a commutator of two Hermitian operators can be presented as

$$\left[\hat{T}_1, \hat{T}_2\right] = i\hat{A}, \tag{3.30}$$

where $\hat{A}$ is Hermitian. If the commutator is a number, Eq. 3.30 is reduced to

$$\left[\hat{T}_1, \hat{T}_2\right] = ic, \tag{3.31}$$

where $c$ is real.
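A concrete pair of Hermitian matrices makes Eqs. 3.29 and 3.30 tangible. The sketch below uses the two $2 \times 2$ matrices commonly known as the Pauli matrices $\sigma_x$ and $\sigma_y$; their appearance here is my own choice of example and not something introduced in this section, and the helper names are hypothetical.

```python
def dagger(matrix):
    """Transpose and complex-conjugate a nested-list matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = a b - b a."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


SIGMA_X = [[0, 1], [1, 0]]        # Hermitian (assumed example)
SIGMA_Y = [[0, -1j], [1j, 0]]     # Hermitian (assumed example)

comm = commutator(SIGMA_X, SIGMA_Y)     # anti-Hermitian, Eq. 3.29
A = scale(-1j, comm)                    # the operator A of Eq. 3.30, comm = i A
```

For this pair `A` turns out to be a real diagonal matrix, which is manifestly Hermitian.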

3.2.3 Eigenvalues and Eigenvectors

When an operator acts on a generic vector, the result is a different vector. For instance, the differentiation operator acting on the function $e^{-x^2}$, $\hat{D}e^{-x^2} = -2xe^{-x^2}$, produces a different function. If, however, you apply the same operator to the function $e^{kx}$, where $k$ is a constant, the result will be the same function multiplied by a number: $\hat{D}e^{kx} = ke^{kx}$. This example illustrates a general phenomenon: among the many vectors that are changed by operators into completely different vectors, there are some that are only multiplied by a number. This special class of vectors, called eigenvectors, plays an important role in the application of operators in quantum physics. The number that appears as a factor in front of an eigenvector is specific for each vector (or a limited subset thereof) and is called an eigenvalue. The formal definition of an eigenvector and an eigenvalue is as follows: vector $|\alpha\rangle$ is an eigenvector of operator $\hat{T}$ with a respective eigenvalue $\lambda_\alpha$ if

$$\hat{T}|\alpha\rangle = \lambda_\alpha|\alpha\rangle. \tag{3.32}$$

Each eigenvector has one and only one corresponding eigenvalue, but the converse is not always true. If for a given eigenvalue there exists only a single eigenvector, we describe this eigenvalue as non-degenerate. If the opposite happens, and several eigenvectors “belong” to the same eigenvalue, the respective eigenvalue is naturally called “degenerate.” In the non-degenerate case, an eigenvalue determines the respective eigenvector up to a constant factor (a vector appearing in Eq. 3.32 can be multiplied by any number without destroying the equation). If we, however, require that all eigenvectors be normalized, then the eigenvalue will define the respective eigenvector uniquely (up to an arbitrary phase factor, which cannot be fixed by normalization but which does not affect any physical results), so that I can designate it simply as $|\lambda\rangle$.

To distinguish between different eigenvectors belonging to the same eigenvalue, I need an additional index, so that Eq. 3.32 becomes

$$\hat{T}|\lambda, \nu\rangle = \lambda|\lambda, \nu\rangle. \tag{3.33}$$

The physical meaning of the additional index will become clear later, but for now, it is just a way to distinguish between different eigenvectors belonging to the same eigenvalue. An important property of degenerate eigenvectors is that any linear combination of these vectors is again an eigenvector belonging to the same eigenvalue. Indeed, consider a vector

$$|\alpha\rangle = a_{\nu_1}|\lambda, \nu_1\rangle + a_{\nu_2}|\lambda, \nu_2\rangle$$

and apply operator $\hat{T}$ to it:

$$\hat{T}|\alpha\rangle = \hat{T}\left(a_{\nu_1}|\lambda, \nu_1\rangle + a_{\nu_2}|\lambda, \nu_2\rangle\right) = a_{\nu_1}\lambda|\lambda, \nu_1\rangle + a_{\nu_2}\lambda|\lambda, \nu_2\rangle = \lambda|\alpha\rangle,$$

where I used Eq. 3.33. Using mathematical lingo, you can say that eigenvectors belonging to a degenerate eigenvalue form a subspace of the total linear space, because by forming any linear combination thereof you remain within the same set of vectors, in complete agreement with the definition of a vector space.

Now I shall prove an important theorem concerning eigenvectors of commuting operators and discuss its consequences.

Theorem 1 (Eigenvectors of Commuting Operators) Consider two operators $\hat{T}_1$ and $\hat{T}_2$ such that $\hat{T}_1\hat{T}_2 = \hat{T}_2\hat{T}_1$. Also assume that $\lambda_{T_1}$ is a non-degenerate eigenvalue of $\hat{T}_1$ with eigenvector $|\lambda_{T_1}\rangle$. Then, this vector is also an eigenvector of the operator $\hat{T}_2$.

Proof Consider

$$\hat{T}_2\hat{T}_1|\lambda_{T_1}\rangle = \lambda_{T_1}\hat{T}_2|\lambda_{T_1}\rangle = \hat{T}_1\hat{T}_2|\lambda_{T_1}\rangle,$$

where at the last step I used the commutative property of the operators. The obtained result means that $\hat{T}_2|\lambda_{T_1}\rangle$ is also an eigenvector of $\hat{T}_1$ with the same eigenvalue $\lambda_{T_1}$. However, since it was assumed that $\lambda_{T_1}$ is non-degenerate, this new eigenvector might differ from $|\lambda_{T_1}\rangle$ only by a constant factor:

$$\hat{T}_2|\lambda_{T_1}\rangle = \lambda_{T_2}|\lambda_{T_1}\rangle,$$

which means that $|\lambda_{T_1}\rangle$ is an eigenvector of $\hat{T}_2$.

The non-degenerate nature of the eigenvalue of $\hat{T}_1$ is essential for this proof to work. Thus, if eigenvalues of $\hat{T}_1$ are degenerate, not all eigenvectors of $\hat{T}_1$ will also be eigenvectors of $\hat{T}_2$. However, it can be proven (though the proof is much more involved and will not be reproduced here) that one can always form a linear combination of these degenerate eigenvectors that is an eigenvector of $\hat{T}_2$, with its own eigenvalue $\lambda_{T_2}$. In this case, assigning eigenvalues of both $\hat{T}_1$ and $\hat{T}_2$ might provide a unique characterization of a vector, which is a simultaneous eigenvector of both operators and can be notated as $|\lambda_{T_1}, \lambda_{T_2}\rangle$. Comparing this notation to Eq. 3.33, one can see that the index $\nu$ in that equation can be understood as an eigenvalue of a commuting partner operator. If there exists a third operator, $\hat{T}_3$, commuting with both $\hat{T}_1$ and $\hat{T}_2$, one can find common eigenvectors for all three operators, in which case a full unique characterization of such a state would require specifying three eigenvalues, $|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle$, where

$$\hat{T}_1|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle = \lambda_{T_1}|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle,$$

$$\hat{T}_2|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle = \lambda_{T_2}|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle,$$

$$\hat{T}_3|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle = \lambda_{T_3}|\lambda_{T_1}, \lambda_{T_2}, \lambda_{T_3}\rangle.$$

In general, in order to fully and uniquely characterize an eigenvector of an operator with degenerate eigenvalues, one needs to find the complete set of commuting operators (CSCO), i.e., all operators which commute with each other.

To help you visualize these rather abstract concepts, I will illustrate them with a simple example involving commuting matrices, but you have to be prepared for some lengthy computations. So, brace yourself! This example will also illustrate the process of finding eigenvalues and eigenvectors of operators in a matrix form.

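Example 11 (Eigenvectors of Commuting Matrices) Consider two $3 \times 3$ matrices

$$M_1 = \begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix}; \qquad M_2 = \begin{bmatrix} 1 & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 \end{bmatrix}. \tag{3.34}$$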

It does not take much effort to compute their products (you can use a symbolic computational platform such as Mathematica or Maple if you are too lazy to do it yourself) and to see that the matrices, indeed, commute:

$$M_1 \cdot M_2 = M_2 \cdot M_1 = \begin{bmatrix} \frac{3}{4} & -\frac{3}{2\sqrt{2}} & -\frac{5}{4} \\ -\frac{3}{2\sqrt{2}} & -\frac{1}{2} & -\frac{3}{2\sqrt{2}} \\ -\frac{5}{4} & -\frac{3}{2\sqrt{2}} & \frac{3}{4} \end{bmatrix}.$$

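Vectors in this case are single columns with three elements,

$$|\alpha\rangle = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix},$$

and the eigenvector equation 3.32 takes the form of a matrix equation. For $M_1$ this equation is

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \lambda \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}.$$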


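It is convenient to collect all terms on one side and present this equation in the form

$$\begin{bmatrix} \frac{5}{4} - \lambda & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\ \frac{1}{2\sqrt{2}} & \frac{3}{2} - \lambda & \frac{1}{2\sqrt{2}} \\ \frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4} - \lambda \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = 0. \tag{3.35}$$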

What we have here is a matrix form of a system of three linear homogeneous equations, which always has at least one solution: $u_1 = u_2 = u_3 = 0$. This solution, however, is not what I had in mind when introducing the concept of eigenvectors. We need non-zero solutions, but they might exist only if the determinant of the matrix representing coefficients of this equation is equal to zero. (Cramer's rule of linear algebra, anyone?) Computing the determinant and setting it to zero, I arrive at the following equation:

$$
\lambda^3 - 4\lambda^2 + 5\lambda - 2 = 0,
$$

which has three solutions, $\lambda_{1,2} = 1$; $\lambda_3 = 2$, two of which coincide, signifying that the matrix does have degenerate eigenvalues. (These solutions can be found by factoring the polynomial as $(\lambda - 1)^2 (\lambda - 2)$.)
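The link between the determinant and the cubic can be checked numerically. The sketch below (plain Python; `det3`, `shifted`, and `char_poly` are illustrative helper names) evaluates $\det(M_1 - \lambda I)$ at a few values of $\lambda$ and compares it with the cubic from the text. The two differ only by an overall sign, which does not affect the roots.

```python
import math

S = 1 / (2 * math.sqrt(2))   # the matrix entry 1/(2*sqrt(2))
M1 = [[5/4, S, 1/4],
      [S, 3/2, S],
      [1/4, S, 5/4]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def shifted(m, lam):
    """Return m - lam * I for a 3x3 matrix m."""
    return [[m[r][c] - (lam if r == c else 0) for c in range(3)]
            for r in range(3)]

def char_poly(lam):
    """The cubic quoted in the text: lam^3 - 4 lam^2 + 5 lam - 2."""
    return lam**3 - 4 * lam**2 + 5 * lam - 2

for lam in (1, 2, 0.5, 3):
    # The determinant equals minus the cubic, so both vanish at lam = 1 and 2.
    print(lam, det3(shifted(M1, lam)), -char_poly(lam))
```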

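Now, for each eigenvalue, I will find a respective eigenvector, beginning with the non-degenerate eigenvalue $\lambda_3 = 2$. Substituting this eigenvalue in Eq. 3.35, I reduce it to

$$
\begin{bmatrix}
-\frac{3}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\
\frac{1}{2\sqrt{2}} & -\frac{1}{2} & \frac{1}{2\sqrt{2}} \\
\frac{1}{4} & \frac{1}{2\sqrt{2}} & -\frac{3}{4}
\end{bmatrix}
\begin{bmatrix} u^{(3)}_1 \\ u^{(3)}_2 \\ u^{(3)}_3 \end{bmatrix} = 0,
$$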

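where the added upper index in $u^{(3)}_i$ indicates that this eigenvector belongs to the third eigenvalue. Expanding the matrix equation in an explicit system of linear equations yields

$$
\begin{aligned}
-\frac{3}{4} u^{(3)}_1 + \frac{1}{2\sqrt{2}} u^{(3)}_2 + \frac{1}{4} u^{(3)}_3 = 0 \;&\Rightarrow\; -3u^{(3)}_1 + \sqrt{2}\, u^{(3)}_2 + u^{(3)}_3 = 0 \\
\frac{1}{2\sqrt{2}} u^{(3)}_1 - \frac{1}{2} u^{(3)}_2 + \frac{1}{2\sqrt{2}} u^{(3)}_3 = 0 \;&\Rightarrow\; u^{(3)}_1 - \sqrt{2}\, u^{(3)}_2 + u^{(3)}_3 = 0 \\
\frac{1}{4} u^{(3)}_1 + \frac{1}{2\sqrt{2}} u^{(3)}_2 - \frac{3}{4} u^{(3)}_3 = 0 \;&\Rightarrow\; u^{(3)}_1 + \sqrt{2}\, u^{(3)}_2 - 3u^{(3)}_3 = 0.
\end{aligned}
$$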

Combining the last two equations, I get $2u^{(3)}_1 - 2u^{(3)}_3 = 0 \Rightarrow u^{(3)}_1 = u^{(3)}_3$. Then, the first two equations are reduced to two identical equations:

$$
\begin{aligned}
-2u^{(3)}_1 + \sqrt{2}\, u^{(3)}_2 &= 0 \\
2u^{(3)}_1 - \sqrt{2}\, u^{(3)}_2 &= 0,
\end{aligned}
$$

which means that the value of one of the coefficients $u^{(3)}_{1,2}$ can be chosen arbitrarily. For instance, you can express these coefficients in terms of the yet undefined $u^{(3)}_1$: $u^{(3)}_2 = \sqrt{2}\, u^{(3)}_1$; $u^{(3)}_3 = u^{(3)}_1$. Using notation $|2\rangle$ to designate this eigenvector (2 in this notation refers to the value of the respective eigenvalue), I can write

$$
|2\rangle = u^{(3)}_1 \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}.
$$

The value of the remaining coefficient can be fixed (if the undefined coefficients make you nervous) by requiring that the vector is normalized:

$$
\left|u^{(3)}_1\right|^2 \begin{bmatrix} 1 & \sqrt{2} & 1 \end{bmatrix} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix} = 4\left|u^{(3)}_1\right|^2 = 1 \;\Rightarrow\; u^{(3)}_1 = \frac{1}{2}.
$$

Thus, the normalized eigenvector belonging to the eigenvalue $\lambda = 2$ is found to be

$$
|2\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}. \tag{3.36}
$$

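Now let me deal with the degenerate eigenvalue $\lambda_{1,2} = 1$. In this case, the eigenvector equation becomes

$$
\begin{bmatrix}
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\
\frac{1}{2\sqrt{2}} & \frac{1}{2} & \frac{1}{2\sqrt{2}} \\
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4}
\end{bmatrix}
\begin{bmatrix} u^{(1,2)}_1 \\ u^{(1,2)}_2 \\ u^{(1,2)}_3 \end{bmatrix} = 0
$$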

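or in the expanded form

$$
\begin{aligned}
\frac{1}{4} u^{(1,2)}_1 + \frac{1}{2\sqrt{2}} u^{(1,2)}_2 + \frac{1}{4} u^{(1,2)}_3 = 0 \;&\Rightarrow\; u^{(1,2)}_1 + \sqrt{2}\, u^{(1,2)}_2 + u^{(1,2)}_3 = 0 \\
\frac{1}{2\sqrt{2}} u^{(1,2)}_1 + \frac{1}{2} u^{(1,2)}_2 + \frac{1}{2\sqrt{2}} u^{(1,2)}_3 = 0 \;&\Rightarrow\; u^{(1,2)}_1 + \sqrt{2}\, u^{(1,2)}_2 + u^{(1,2)}_3 = 0 \\
\frac{1}{4} u^{(1,2)}_1 + \frac{1}{2\sqrt{2}} u^{(1,2)}_2 + \frac{1}{4} u^{(1,2)}_3 = 0 \;&\Rightarrow\; u^{(1,2)}_1 + \sqrt{2}\, u^{(1,2)}_2 + u^{(1,2)}_3 = 0.
\end{aligned}
$$

In this case, all three equations coincide, meaning that I can choose two coefficients arbitrarily, e.g., $u^{(1,2)}_1$ and $u^{(1,2)}_3$, while expressing the remaining coefficient as

$$
u^{(1,2)}_2 = -\left(u^{(1,2)}_1 + u^{(1,2)}_3\right)/\sqrt{2}.
$$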

Choosing different values of the two free coefficients, I can generate different eigenvectors all belonging to the same eigenvalue. For instance, choosing $u^{(1)}_1 = 0$ for one vector and $u^{(2)}_3 = 0$ for the other, I generate two distinct vectors:

$$
|1\rangle_1 = u^{(1)}_3 \begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ 1 \end{bmatrix}; \qquad
|1\rangle_2 = u^{(2)}_1 \begin{bmatrix} 1 \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}, \tag{3.37}
$$

which can also be normalized. Any linear combination of these vectors will also be an eigenvector.
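The last statement is easy to test numerically. In the sketch below (plain Python; the weights `a` and `b` are arbitrary illustrative numbers and the free coefficients in Eq. 3.37 are set to 1), a linear combination of the two degenerate vectors is fed to $M_1$ and compared with itself. The script also computes the inner product of the two vectors, which turns out to be non-zero, so this particular pair is not orthogonal.

```python
import math

S = 1 / (2 * math.sqrt(2))   # the matrix entry 1/(2*sqrt(2))
M1 = [[5/4, S, 1/4],
      [S, 3/2, S],
      [1/4, S, 5/4]]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix and a 3-component vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

v1 = [0, -1 / math.sqrt(2), 1]   # first vector of Eq. 3.37, free coefficient = 1
v2 = [1, -1 / math.sqrt(2), 0]   # second vector of Eq. 3.37, free coefficient = 1

a, b = 2.0, -0.7                 # arbitrary weights of the linear combination
combo = [a * x + b * y for x, y in zip(v1, v2)]

# For eigenvalue 1 the matrix must return the vector unchanged.
residual = max(abs(p - q) for p, q in zip(apply(M1, combo), combo))
overlap = sum(x * y for x, y in zip(v1, v2))   # inner product of v1 and v2
print(residual < 1e-12, overlap)
```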

Now I turn my attention to matrix $M_2$. Again computing the determinant

$$
\begin{vmatrix}
1 - \lambda & -\frac{1}{\sqrt{2}} & -1 \\
-\frac{1}{\sqrt{2}} & -\lambda & -\frac{1}{\sqrt{2}} \\
-1 & -\frac{1}{\sqrt{2}} & 1 - \lambda
\end{vmatrix}
$$

and setting it to zero, I end up with the equation

$$
\lambda^3 - 2\lambda^2 - \lambda + 2 = 0,
$$

which again can be solved by factorization and yields $\lambda_1 = 2$, $\lambda_2 = -1$, $\lambda_3 = 1$. Each of these eigenvalues (which, by the way, are non-degenerate) has its own eigenvector, which can be found in the same way as above. I will leave the actual calculations as an exercise and present here only the final answers for the normalized eigenvectors:

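$$
|2\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \qquad
|{-1}\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}, \qquad
|1\rangle = \frac{1}{2} \begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}, \tag{3.38}
$$

where eigenvectors are again labeled by their respective eigenvalues. Now, it is obvious that eigenvector $|{-1}\rangle$ of matrix $M_2$ is also an eigenvector of $M_1$ (it coincides with the vector found in Eq. 3.36), so I only need to check the remaining vectors: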

$$
\begin{bmatrix}
\frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\
\frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4}
\end{bmatrix}
\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},
$$

so this vector is an eigenvector of $M_1$ with eigenvalue $\lambda = 1$. Note that the elements of this vector obey the condition $u^{(1,2)}_2 = -\left(u^{(1,2)}_1 + u^{(1,2)}_3\right)/\sqrt{2}$ derived for the degenerate eigenvectors of $M_1$, with $u_2 = 0$, $u_1 = -u_3 = -1$. Now, for the remaining eigenvector of $M_2$, I have

$$
\begin{bmatrix}
\frac{5}{4} & \frac{1}{2\sqrt{2}} & \frac{1}{4} \\
\frac{1}{2\sqrt{2}} & \frac{3}{2} & \frac{1}{2\sqrt{2}} \\
\frac{1}{4} & \frac{1}{2\sqrt{2}} & \frac{5}{4}
\end{bmatrix}
\begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix},
$$

i.e., this is also an eigenvector of $M_1$ with the same eigenvalue. For this vector I also have $u_2 = -(u_1 + u_3)/\sqrt{2}$ with $u_1 = u_3 = 1$. Thus, I can present the system of common eigenvectors of these two matrices, in which degenerate eigenvectors become uniquely defined by virtue of their belonging to the eigenvalues of a second commuting matrix. Now, all these eigenvectors can be designated as $|1, 2\rangle$, $|1, 1\rangle$, and $|2, -1\rangle$, where the first and second numbers refer to the eigenvalues of $M_1$ and $M_2$, respectively.
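The labeling by two eigenvalues can be confirmed with a short numerical sketch (plain Python; the helper `eigenvalue_of` and the dictionary keys are illustrative names, not notation from the text). For each of the three normalized vectors of Eq. 3.38 it checks that the vector is an eigenvector of both matrices and reports the pair of eigenvalues.

```python
import math

R = 1 / math.sqrt(2)   # the entry 1/sqrt(2)
S = R / 2              # the entry 1/(2*sqrt(2))
M1 = [[5/4, S, 1/4], [S, 3/2, S], [1/4, S, 5/4]]
M2 = [[1, -R, -1], [-R, 0, -R], [-1, -R, 1]]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix and a 3-component vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def eigenvalue_of(m, v, tol=1e-12):
    """Return lam if m v = lam v within tol, otherwise None."""
    w = apply(m, v)
    k = max(range(3), key=lambda i: abs(v[i]))   # index of a non-zero component
    lam = w[k] / v[k]
    ok = all(abs(w[i] - lam * v[i]) < tol for i in range(3))
    return lam if ok else None

# Normalized vectors of Eq. 3.38, keyed by (M1 eigenvalue, M2 eigenvalue).
common = {
    "|1, 2>": [-R, 0, R],
    "|1, 1>": [0.5, -R, 0.5],
    "|2, -1>": [0.5, R, 0.5],
}
labels = {name: (eigenvalue_of(M1, v), eigenvalue_of(M2, v))
          for name, v in common.items()}
for name, pair in labels.items():
    print(name, pair)
```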

3.3 Operators and Observables

3.3.1 Hermitian Operators

One might notice a striking similarity between CSCO and the concept of the complete set of mutually consistent observables discussed in Sect. 2.1. Also, the state vectors characterized by definite values of compatible observables look like eigenvectors of operators characterized by eigenvalues of commuting operators. It appears reasonable, therefore, to expect that one can establish a connection between physical observables and quantum states characterized by the values of the observables on one hand and the mathematical concepts of operators and their eigenvalues and eigenvectors on the other hand. This connection is indeed established by the following postulates laying down the foundation of formalism of quantum mechanics.

Postulate 1 (Observables and Hermitian Operators) Every observable is represented in quantum theory by a Hermitian operator.


Postulate 2 Eigenvalues of operators constructed to represent an observable determine values, which a measurement of the observable might yield, and eigenvectors define states, in which a measurement of the observable represented by the operator will with certainty produce the corresponding value.

The first question which might pop up in someone’s mind after reading the first of these postulates is, why does it single out Hermitian operators? The fact of the matter is that Hermitian operators possess a number of special properties, which make them practically suitable for their intended use as representative of physical observables. These properties can be formulated in the form of several theorems.

Theorem 2 (Theorem of the Eigenvalues) Eigenvalues of Hermitian operators with discrete spectrum are necessarily real-valued.

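Proof Let $|\lambda_n\rangle$ be an eigenvector of a Hermitian operator $\hat T$ corresponding to eigenvalue $\lambda_n$:

$$
\hat T |\lambda_n\rangle = \lambda_n |\lambda_n\rangle .
$$

Premultiplying this expression by $\langle\lambda_n|$, I get

$$
\langle\lambda_n| \hat T |\lambda_n\rangle = \lambda_n \langle\lambda_n|\lambda_n\rangle .
$$

Performing complex conjugation of this expression and using the definition of the Hermitian conjugate operator, Eq. 3.14, I derive

$$
\left(\langle\lambda_n| \hat T |\lambda_n\rangle\right)^* = \langle\lambda_n| \hat T^\dagger |\lambda_n\rangle = \lambda_n^* \langle\lambda_n|\lambda_n\rangle ,
$$

where it is assumed that the norm of the vector exists and is a real-valued quantity. For Hermitian operators $\hat T^\dagger = \hat T$, in which case the left-hand sides of the last two equations coincide, yielding $\lambda_n^* = \lambda_n$, which means, of course, that $\lambda_n$ is a real number.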
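A small numerical illustration of the theorem, offered as a sketch rather than a proof: for a $2 \times 2$ matrix the eigenvalues follow from the quadratic formula applied to its characteristic equation. The two matrices below are hypothetical examples chosen only for this check; the first equals its own conjugate transpose, the second does not.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lam^2 - tr(m)*lam + det(m) = 0 for a 2x2 matrix m."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

hermitian = [[2, 1 - 1j], [1 + 1j, 3]]       # equals its conjugate transpose
not_hermitian = [[2, 1 - 1j], [1 - 1j, 3]]   # off-diagonal entries are not conjugates

real_case = eigenvalues_2x2(hermitian)
complex_case = eigenvalues_2x2(not_hermitian)
print(real_case)      # imaginary parts vanish
print(complex_case)   # imaginary parts do not vanish
```

Breaking hermiticity in the second matrix is enough to push the eigenvalues off the real axis, which is why the postulate singles out Hermitian operators.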

The importance of this theorem for association between physical observables and operators is obvious—results of any measurements are always expressed by real numbers, and the theorem guarantees that the mathematical constructs (eigenvalues) used to connect the formalism with the real world of experiments and observations are consistent with this natural requirement. The assumption that the norm of the respective eigenvectors exists, which is a critical element of the proof of the theorem, can be rigorously validated only for Hermitian operators with discrete spectrum.3

Eigenvectors of operators with continuous spectrum are not normalizable in the usual sense (see Sect. 2.3), so this theorem does not apply to them. At the same time, we need such continuous spectrum operators as momentum or coordinate to describe physical reality, so we have to find a way to avoid having to deal with unrealistic complex eigenvalues. Leaving the mathematical intricacies of this problem to mathematicians, I solve it here by a sleight of hand. I simply postulate that only real eigenvalues and their corresponding eigenvectors of such operators can be used to represent quantum states and the results of measurements. It can be shown that the eigenvectors corresponding to real eigenvalues of Hermitian operators with continuous spectrum can be normalized in the sense of Eq. 2.41. To illustrate the last point, consider operator $i\,d/dx$ that I have previously proved to be Hermitian. The eigenvectors of this operator have the form of $e^{-ikx}$, with $k$ being an eigenvalue:

$$
i\frac{d}{dx} e^{-ikx} = k e^{-ikx}.
$$

If I force $k$ to be a real number, I can use the properties of the delta-function to write

$$
\int dx\, e^{ix(k - k_1)} = 2\pi \delta(k - k_1),
$$

which is the orthonormalization requirement for the eigenvectors belonging to continuous spectrum. You may want to notice that the integral in this expression is reduced to the delta-function only for real-valued $k$.

3 I borrowed this fact without proof from the branch of mathematics called functional analysis that studies the properties of linear operators.

Theorem 3 (Theorem of Eigenvectors) Eigenvectors of Hermitian operators with discrete spectrum belonging to different eigenvalues are necessarily orthogonal.

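Proof Consider two different eigenvalues $\lambda_1$ and $\lambda_2$ of a Hermitian operator $\hat T$ together with their eigenvectors $|\lambda_1\rangle$ and $|\lambda_2\rangle$:

$$
\hat T |\lambda_1\rangle = \lambda_1 |\lambda_1\rangle , \qquad \hat T |\lambda_2\rangle = \lambda_2 |\lambda_2\rangle .
$$

Premultiply the first of these equations by $\langle\lambda_2|$ and the second one by $\langle\lambda_1|$:

$$
\langle\lambda_2| \hat T |\lambda_1\rangle = \lambda_1 \langle\lambda_2|\lambda_1\rangle , \qquad \langle\lambda_1| \hat T |\lambda_2\rangle = \lambda_2 \langle\lambda_1|\lambda_2\rangle .
$$

Complex conjugate the second of these equations, use Eq. 3.14 (which defines the Hermitian conjugate operator), and take into account that $\hat T$ is Hermitian. This yields

$$
\langle\lambda_2| \hat T |\lambda_1\rangle = (\lambda_2)^* \langle\lambda_1|\lambda_2\rangle^* ,
$$

so that the pair of equations from above can be written as

$$
\langle\lambda_2| \hat T |\lambda_1\rangle = \lambda_1 \langle\lambda_2|\lambda_1\rangle , \qquad \langle\lambda_2| \hat T |\lambda_1\rangle = (\lambda_2)^* \langle\lambda_1|\lambda_2\rangle^* .
$$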


Taking into account that the eigenvalues of Hermitian operators are real and that, according to the property of the inner product, $\langle\lambda_2|\lambda_1\rangle = \langle\lambda_1|\lambda_2\rangle^*$, you finally obtain

$$
\lambda_1 \langle\lambda_2|\lambda_1\rangle = \lambda_2 \langle\lambda_2|\lambda_1\rangle .
$$

If $\lambda_1 \neq \lambda_2$, you have no choice but to conclude that $\langle\lambda_2|\lambda_1\rangle = 0$.

In the case of Hermitian operators with degenerate spectrum, the situation is more complex because, as we saw in the matrix example in Sect. 3.2.3, one can generate multiple sets of linearly independent vectors belonging to the same eigenvalue, and they do not have to be orthogonal. At the same time, we also saw that one can always find a set in which the eigenvectors are orthogonal. These special sets of orthogonal vectors belonging to the degenerate eigenvalues are usually also eigenvectors of another operator from the respective CSCO. Thus, you can rest assured that for any Hermitian operator, there exists a set of mutually orthogonal eigenvectors. I already mentioned that the physical meaning of the mathematical concept of orthogonality is mutual exclusivity of values of the observables used to characterize the states, and this comment essentially completes our identification of mutually exclusive states characterized by a set of mutually consistent observables with eigenvectors of operators belonging to a complete set of mutually commuting operators.
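One standard way to build an orthogonal pair inside a degenerate subspace is Gram-Schmidt orthogonalization. This technique is not used in the text and is brought in here only as an illustration. The sketch below (plain Python) applies one Gram-Schmidt step to the two non-orthogonal vectors of Eq. 3.37, with the free coefficients set to 1, and checks that the result still satisfies the eigenvector condition $u_2 = -(u_1 + u_3)/\sqrt{2}$ for eigenvalue 1.

```python
import math

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

v1 = [0, -1 / math.sqrt(2), 1]   # first vector of Eq. 3.37
v2 = [1, -1 / math.sqrt(2), 0]   # second vector of Eq. 3.37

# Gram-Schmidt step: remove from v2 its component along v1.
coeff = dot(v1, v2) / dot(v1, v1)
w2 = [y - coeff * x for x, y in zip(v1, v2)]

# w2 is a linear combination of v1 and v2, so it must still obey
# u2 = -(u1 + u3)/sqrt(2), the condition for eigenvalue 1 of M1.
condition_gap = abs(w2[1] + (w2[0] + w2[2]) / math.sqrt(2))
print(dot(v1, w2), condition_gap)
```

The orthogonal pair obtained this way is not the pair singled out by $M_2$ in Eq. 3.38. It is simply one of infinitely many orthogonal choices, which is exactly why a second commuting operator is needed to pick a unique one.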

Theorem 4 (Completeness of Eigenvectors) The set of eigenvectors of Hermitian operators is complete in a sense that any state in the respective Hilbert vector space can be presented as a linear combination of these eigenvectors.

The completeness property gives a rigorous mathematical justification to the generalization of the superposition principle expressed by Eq. 2.26. This property essentially states that eigenvectors of Hermitian operators with discrete spectrum form a countable basis in the Hilbert vector space. It can also be expressed in the form of a so-called completeness or "closure" relation, which can be presented as a useful operator identity. To derive it, I first rewrite Eq. 2.26 in a more compact form as

$$
|\alpha\rangle = \sum_n a_n |\lambda_n\rangle , \tag{3.39}
$$

where index $n$ enumerates the eigenvectors and each eigenvector $|\lambda_n\rangle$, which is assumed to be normalized, is characterized by all available eigenvalues of the respective CSCO. Expansion coefficients $a_n$ in this expression can be found as $a_n = \langle\lambda_n|\alpha\rangle$, as established in Eq. 2.24. After substitution of this expression back into Eq. 3.39, the latter becomes

$$
|\alpha\rangle = \sum_n |\lambda_n\rangle \langle\lambda_n|\alpha\rangle \equiv \left( \sum_n |\lambda_n\rangle \langle\lambda_n| \right) |\alpha\rangle . \tag{3.40}
$$


In the last expression here, I split off ket vector $|\alpha\rangle$ from the bra $\langle\lambda_n|$ and combined the latter with another ket $|\lambda_n\rangle$. The ket and bra vectors enclosed in the parentheses are in unusual positions: the bra is on the right of the ket, which is opposite to their regular positions in the standard inner product. As you can guess, expression $\hat P^{(n)} = |\lambda_n\rangle \langle\lambda_n|$ is not an inner product, but does it have any sensible meaning at all? In the matrix example of the vectors, this expression corresponds to the situation in which the column vector is written down to the left of the row vector, which is the arrangement used to form the outer or tensor product mentioned in the previous section. Respectively, in the case of abstract generic ket and bra vectors, $|\lambda_n\rangle \langle\lambda_n|$ can be understood as an outer product of two vectors. Naturally, just as the outer product of rows and columns yields a matrix, the outer product of bras and kets generates an operator: indeed, if you bring the split-off ket vector back, you can construct the following expression:

$$
\hat P^{(n)} |\alpha\rangle = |\lambda_n\rangle \langle\lambda_n|\alpha\rangle , \tag{3.41}
$$

i.e., the result of the action of $\hat P^{(n)}$ on $|\alpha\rangle$ is vector $|\lambda_n\rangle$ multiplied by a number. If $|\alpha\rangle$ and $|\lambda_n\rangle$ were, respectively, a regular three-dimensional vector and one of the unit vectors specifying a particular direction, you could say that $\hat P^{(n)}$ projects $|\alpha\rangle$ on $|\lambda_n\rangle$ and generates the component of $|\alpha\rangle$ in the direction specified by $|\lambda_n\rangle$. It is customary to maintain the same terminology and call operator $\hat P^{(n)}$ a projection operator.

Example 12 (Projection Operators) To get accustomed to working with operators of the form $\hat P^{(n)} = |\lambda_n\rangle \langle\lambda_n|$, let me prove the main property of the projection operators, $\left[\hat P^{(n)}\right]^2 = \hat P^{(n)}$:

$$
\left[\hat P^{(n)}\right]^2 = |\lambda_n\rangle \langle\lambda_n|\lambda_n\rangle \langle\lambda_n| .
$$

The expression in the middle looks like an inner product of a basis vector with itself, and as such it is equal to unity. Thus, we have

$$
\left[\hat P^{(n)}\right]^2 = |\lambda_n\rangle \langle\lambda_n| = \hat P^{(n)} .
$$
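The same property can be seen in matrix form. The sketch below (plain Python; `outer`, `matmul`, and `apply` are illustrative helpers, and the test vector `alpha` is arbitrary) builds the projector from the normalized column of Eq. 3.36, checks that squaring it changes nothing, and confirms that acting on a vector returns the basis vector multiplied by the inner product, as in Eq. 3.41.

```python
import math

def outer(ket, bra):
    """Matrix of |ket><bra| for real vectors: column times row."""
    return [[k * b for b in bra] for k in ket]

def matmul(a, b):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Matrix-vector product for a 3x3 matrix and a 3-component vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

lam = [0.5, 1 / math.sqrt(2), 0.5]   # normalized vector of Eq. 3.36
P = outer(lam, lam)                  # projection operator |lam><lam|
P2 = matmul(P, P)
idempotency_gap = max(abs(P2[i][j] - P[i][j])
                      for i in range(3) for j in range(3))

alpha = [1.0, 2.0, 3.0]              # arbitrary test vector
overlap = sum(l * a for l, a in zip(lam, alpha))   # the number <lam|alpha>
projected = apply(P, alpha)
expected = [overlap * l for l in lam]              # |lam> times <lam|alpha>
print(idempotency_gap < 1e-12, projected, expected)
```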

The expression inside the parentheses in Eq. 3.40 is a sum of projection operators, but most importantly, it is easy to see that this sum is identical to a unity operator: it acts on vector $|\alpha\rangle$ and generates the same vector. This statement can be written as the following identity:

$$
\sum_n |\lambda_n\rangle \langle\lambda_n| = \hat I , \tag{3.42}
$$

which is the completeness or closure relation. This is a useful operator identity, which will be frequently used in what follows.
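For the $3 \times 3$ example of Sect. 3.2 the closure relation can be verified directly. In this sketch (plain Python) the basis is taken to be the three normalized common eigenvectors of Eq. 3.38; summing their outer products should reproduce the identity matrix, and the coefficients $a_n = \langle\lambda_n|\alpha\rangle$ should rebuild an arbitrary test vector.

```python
import math

R = 1 / math.sqrt(2)
# Normalized common eigenvectors of M1 and M2 from Eq. 3.38.
basis = [[-R, 0, R], [0.5, -R, 0.5], [0.5, R, 0.5]]

# Sum over n of |lam_n><lam_n|, written out entry by entry.
closure = [[sum(v[i] * v[j] for v in basis) for j in range(3)]
           for i in range(3)]
identity_gap = max(abs(closure[i][j] - (1 if i == j else 0))
                   for i in range(3) for j in range(3))

alpha = [1.0, -2.0, 0.5]   # arbitrary test vector
coeffs = [sum(x * y for x, y in zip(v, alpha)) for v in basis]   # a_n
rebuilt = [sum(a * v[i] for a, v in zip(coeffs, basis)) for i in range(3)]
print(identity_gap < 1e-12, rebuilt)
```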


Not all vector spaces used in quantum mechanics can be described by a discrete basis, and sometimes we have to use as a basis eigenvectors of operators with continuous spectrum. I have already discussed this possibility in Sect. 2.3 using states characterized by a definite value of the particle's position, $|\mathbf r\rangle$. Now you can associate these states with eigenvectors of a position operator $\hat{\mathbf r}$. In general, if $|q\rangle$ is an eigenvector of some Hermitian operator with continuous spectrum and $q$ is the respective eigenvalue, you can present an arbitrary state $|\alpha\rangle$ as an integral instead of a sum:

$$
|\alpha\rangle = \int dq\, \psi(q) |q\rangle . \tag{3.43}
$$

Premultiplying Eq. 3.43 by bra $\langle q_1|$ and using the orthogonality condition for continuous spectrum, Eq. 2.41, you will obtain

$$
\langle q_1|\alpha\rangle = \int dq\, \psi(q) \langle q_1|q\rangle = \int dq\, \psi(q) \delta(q_1 - q) = \psi(q_1). \tag{3.44}
$$

Replacing $\psi(q)$ in Eq. 3.43 with its expression derived in Eq. 3.44, you end up with

$$
|\alpha\rangle = \int dq\, |q\rangle \langle q|\alpha\rangle .
$$

Considering expression $\int dq\, |q\rangle \langle q|$ as an operator, you can, similarly to the case of discrete basis, write

$$
\int dq\, |q\rangle \langle q| = \hat I . \tag{3.45}
$$

Equation 3.45 constitutes a completeness condition for eigenvectors of operators with continuous spectrum.

Example 13 (Expansion in Terms of Continuous Basis) To illustrate Eq. 3.43, consider again a linear vector space of integrable functions of a single variable: $|\alpha\rangle \equiv f(x)$. The Fourier transform of this function can be defined as

$$
f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \tilde f(k) e^{ikx},
$$

where the "coefficient" function $\tilde f(k)$ is defined via the inverse transform

$$
\tilde f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f(x) e^{-ikx}.
$$

The role of the continuous basis is played here by functions

$$
|k\rangle \equiv \frac{1}{\sqrt{2\pi}} e^{ikx},
$$

which are eigenvectors of Hermitian operator $-i\,d/dx$ with continuous spectrum consisting of real numbers $k$. These eigenvectors are orthogonal and delta-function normalized:

$$
\frac{1}{2\pi} \int_{-\infty}^{\infty} dx\, e^{i(k_1 - k)x} = \delta(k - k_1).
$$

The completeness condition, Eq. 3.45, for these functions takes the form of

$$
\frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{i(x_1 - x)k} = \delta(x - x_1),
$$

with delta-function $\delta(x - x_1)$ playing the role of the identity operator $\hat I$ in this space:

$$
\int f(x) \delta(x - x_1)\, dx = f(x_1).
$$
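The completeness condition above can be recovered in a few lines by substituting the expression for $\tilde f(k)$ back into the expansion of $f$, in the same way Eq. 3.44 was substituted into Eq. 3.43. The following is a sketch of that step, assuming the order of the two integrations can be interchanged:

```latex
\[
\begin{aligned}
f(x_1) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \tilde f(k)\, e^{ikx_1}
        = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx} e^{ikx_1} \\
       &= \int_{-\infty}^{\infty} dx\, f(x) \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{i(x_1 - x)k} \right]
        = \int_{-\infty}^{\infty} dx\, f(x)\, \delta(x - x_1).
\end{aligned}
\]
```

The bracket in the second line is exactly the left-hand side of the completeness condition, so the Fourier inversion formula and the closure relation for the basis $|k\rangle$ are two readings of the same identity.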

Some operators have a mixed spectrum: it is discrete for one range of eigenvalues and continuous for another range. Completeness relation in this case will be a combination of Eq. 3.42 and Eq. 3.45 with sum over all discrete eigenvectors and the integral over the continuous one.

3.3.2 Quantization Postulate

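Most physical observables can be constructed from just two elements: position vector $\mathbf r$ and momentum $\mathbf p$. I have already introduced states with definite values of the position vector, $|\mathbf r\rangle$, which are supposed to be eigenvectors of a respective Hermitian operator $\hat{\mathbf r}$. Similarly, I can introduce states with definite values of momentum, $|\mathbf p\rangle$, which are supposed to be eigenvectors of the Hermitian momentum operator $\hat{\mathbf p}$. The first thing you will want to know, of course, is what these operators do to quantum states. You could have guessed the answer for the states represented by eigenvectors of the respective operators: $\hat{\mathbf r} |\tilde{\mathbf r}\rangle = \tilde{\mathbf r} |\tilde{\mathbf r}\rangle$, $\hat{\mathbf p} |\tilde{\mathbf p}\rangle = \tilde{\mathbf p} |\tilde{\mathbf p}\rangle$, where I placed a tilde above $\mathbf r$ and $\mathbf p$ to better distinguish between symbols of the respective operators and their eigenvalues and eigenvectors. Using these results I can compute expressions like $\hat{\mathbf r} |\alpha\rangle$ or $\hat{\mathbf p} |\alpha\rangle$ by expanding the state $|\alpha\rangle$ in terms of eigenvectors of the respective operators. For instance, by presenting

$$
|\alpha\rangle = \int d\tilde{\mathbf r}\, \psi(\tilde{\mathbf r}) |\tilde{\mathbf r}\rangle ,
$$

I can find

$$
\hat{\mathbf r} |\alpha\rangle = \int d\tilde{\mathbf r}\, \psi(\tilde{\mathbf r})\, \hat{\mathbf r} |\tilde{\mathbf r}\rangle = \int d\tilde{\mathbf r}\, \psi(\tilde{\mathbf r})\, \tilde{\mathbf r} |\tilde{\mathbf r}\rangle .
$$

Similar treatment for the momentum operator yields

$$
\hat{\mathbf p} |\alpha\rangle = \int d\tilde{\mathbf p}\, \varphi(\tilde{\mathbf p})\, \hat{\mathbf p} |\tilde{\mathbf p}\rangle = \int d\tilde{\mathbf p}\, \varphi(\tilde{\mathbf p})\, \tilde{\mathbf p} |\tilde{\mathbf p}\rangle .
$$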

The problem arises when both position and momentum operators appear in the same expression and we have to figure out how to operate, say, $\hat{\mathbf p}$, on a state expanded in terms of eigenvectors of $\hat{\mathbf r}$ or vice versa. I will discuss this issue later in the book, in the section devoted to "representations" of the state vectors and operators. For now I would just like to say that the solution to this problem depends on the fundamental assumptions about commutation relations involving position and momentum operators. Essentially, the quantization procedure, i.e., the rules determining how to replace classical observables with their representation as quantum operators, consists in the postulation of these commutation relations. You will see many times in this text that the knowledge of the commutators of various operators is all you need to know to perform quantum mechanical calculations. So, please meet the fundamental commutation relations of quantum mechanics.

Postulate 3 (Quantization Postulate) Operators corresponding to various Cartesian components of position vector and momentum obey the following commutation relations:

$$
\left[\hat r_i, \hat r_j\right] = 0; \qquad \left[\hat p_i, \hat p_j\right] = 0 \tag{3.46}
$$

$$
\left[\hat r_i, \hat p_j\right] = i\hbar \delta_{i,j}, \tag{3.47}
$$

where subindexes take values $1, 2, 3$ indicating $x, y, z$ Cartesian components of the position and momentum vectors, respectively.

The first of the commutators in Eq. 3.46 indicates that the Cartesian components of the position vectors are mutually consistent observables. In other words, it means that if a system is in the state with a certain position, all three components of the position vector are well-defined. The same is true for the vector of momentum as expressed by the second of the commutators in Eq. 3.46. These commutators reflect our desire born out of empirical experience for the position and momentum of the quantum systems to be genuinely well-defined quantities, at least when measured independently of each other.

The commutators presented in Eq. 3.47 are often called canonical commutation relations, and they also express our empirical experience, namely, the fact that the same Cartesian components of position and momentum vectors of a quantum system are not mutually consistent observables and cannot, therefore, be described by commuting operators. The actual form of the commutator is chosen to reproduce Heisenberg's uncertainty principle, which is discussed in the next section. You will also see later that the empirical foundation for this form of the commutator can be traced to the de Broglie relation, Eq. 1.3. It is interesting to note a striking similarity between commutators given in Eqs. 3.46 and 3.47 and canonical Poisson brackets of classical mechanics, Eq. 3.5. This similarity lies in the foundation of the so-called canonical quantization rule: any classical conjugated quantities satisfying Eq. 3.5 are promoted in quantum theory to quantum operators obeying the canonical commutation relation 3.47. Therefore, canonically conjugated variables never belong to the same class of mutually consistent observables and are found on the opposite sides of the Bohr complementarity principle.
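As a small illustration of how far the postulated commutators alone can carry a calculation, here is a sketch of the commutator of $\hat x$ with $\hat p_x^2$. It relies on the general operator identity $[\hat A, \hat B \hat C] = [\hat A, \hat B]\hat C + \hat B[\hat A, \hat C]$, which is not stated in this section but follows directly from writing out both sides:

```latex
\[
\begin{aligned}
\left[\hat x, \hat p_x^2\right]
  &= \left[\hat x, \hat p_x\right] \hat p_x + \hat p_x \left[\hat x, \hat p_x\right] \\
  &= i\hbar\, \hat p_x + \hat p_x\, i\hbar
   = 2 i\hbar\, \hat p_x .
\end{aligned}
\]
```

No explicit form of the operators was needed, only Eq. 3.47 with $i = j$.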

3.3.3 Constructing the Observables: A Few Important Examples

Using coordinate and momentum operators, I can construct operators for other observables, which is done according to the standard quantization rule.

Quantization Rule To turn a classical observable into an operator, replace all coordinates and momenta appearing in its classical definition with the corresponding operators, respecting the requirements of hermiticity and the order of multiplication when necessary.

In many situations, the issues related to hermiticity or to the multiplication order of observables are resolved automatically, but in some cases one needs to pay special attention to them. To get you started, consider several of the simplest examples.

Kinetic Energy

Kinetic energy of a single particle with mass $m_e$ is described by operator

$$
\hat K = \frac{\hat{\mathbf p}^2}{2m_e},
$$

which is obtained from the corresponding classical expression by replacing classical momentum with the momentum operator. The eigenvectors of this operator coincide with the eigenvectors of the momentum operator, and its eigenvalues, which form a continuous spectrum, provide values of kinetic energy that can be observed for a system under study.

Potential Energy

Potential energy is obtained from the respective classical potential energy function by replacing the classical coordinate argument of the function with its operator equivalent: $U(\mathbf r) \rightarrow U(\hat{\mathbf r})$. It is assumed here, of course, that the potential energy function can be presented as a series of positive and negative powers of $r$, in which case the corresponding operator expression would have an easily identifiable meaning. Examples of such transformations are the one-dimensional harmonic potential ($kx^2 \rightarrow k\hat x^2$) and the Coulomb potential ($k/r \rightarrow k\hat r^{-1}$), where $r$ is the absolute value of the position vector.4 The eigenvectors of this operator are the same as those of the position operator, and the respective eigenvalues determine the possible values of the potential energy of the system.

Hamiltonian

The Hamiltonian, which in classical mechanics is defined as the energy of the system expressed in terms of canonically conjugated coordinate and momentum, in quantum mechanics becomes, in a single particle case, an operator of the form

$$
\hat H = \frac{\hat{\mathbf p}^2}{2m} + U(\hat{\mathbf r}). \tag{3.48}
$$

Since position and momentum operators do not commute, the eigenvectors of the Hamiltonian are usually different from the eigenvectors of both position and momentum operators. Eigenvalues of the Hamiltonian can belong to discrete, continuous, or mixed spectrum and determine the values of energy which the system can have in the given environment. This is the most important operator in all of quantum physics: just like the classical Hamiltonian, its quantum counterpart controls the dynamics of the quantum objects.

Angular Momentum

Angular momentum is a very special kind of observable. Classical angular momentum is a vector defined as a cross product of the position and momentum vectors, $\mathbf L = \mathbf r \times \mathbf p$. The quantization rule requires that the quantum mechanical angular momentum operator is constructed by promoting position and momentum vectors to the corresponding operators:

$$
\hat{\mathbf L} = \hat{\mathbf r} \times \hat{\mathbf p}. \tag{3.49}
$$

However, since this expression involves the product of potentially non-commuting operators, one has to be careful with the order of the multiplication. One also needs to make sure that the resulting operator is Hermitian. To address both these concerns, I will expand the angular momentum vector in its Cartesian components:

$$
\hat L_x = \hat y \hat p_z - \hat z \hat p_y \tag{3.50}
$$

$$
\hat L_y = \hat z \hat p_x - \hat x \hat p_z \tag{3.51}
$$

$$
\hat L_z = \hat x \hat p_y - \hat y \hat p_x . \tag{3.52}
$$

(One can use as a useful mnemonic device the representation of the vector product as a determinant:

$$
\mathbf r \times \mathbf p \equiv
\begin{vmatrix}
\mathbf e_x & \mathbf e_y & \mathbf e_z \\
x & y & z \\
p_x & p_y & p_z
\end{vmatrix},
$$

where the first line is formed by unit vectors defining corresponding axes of a Cartesian coordinate system.)

4 This transformation is not as trivial as it might seem since taking the absolute value of a vector involves the operation of square root, which is not well defined for operators. Practically it is not a problem, however, because usually one works in the basis of the eigenvectors of the position operator, in which case $\hat r^{-1}$ becomes simply $1/r$. If you are not concerned with any of this, this note is not for you. I mention it here simply in order to avoid accusations of sweeping something under the rug.

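The first thing to notice in Eqs. 3.50–3.52 is that the operators that are actually being multiplied correspond to commuting components of the position and momentum vectors; thus, the order in which you place these operators is not important. Next, you need to verify that each of the components of the angular momentum operator is a Hermitian operator. Hermitian conjugation of, e.g., the $x$-component yields

$$
\hat L_x^\dagger = \left(\hat y \hat p_z\right)^\dagger - \left(\hat z \hat p_y\right)^\dagger = \hat p_z \hat y - \hat p_y \hat z = \hat L_x ,
$$

proving hermiticity of this operator. Similarly, you can demonstrate the Hermitian nature of the two other components. The most unusual property of the angular momentum, however, is that different components of the angular momentum do not commute. To illustrate this point, compute commutator $\left[\hat L_x, \hat L_y\right]$:

$$
\begin{aligned}
\left[\hat L_x, \hat L_y\right]
 &= \left(\hat y \hat p_z - \hat z \hat p_y\right)\left(\hat z \hat p_x - \hat x \hat p_z\right) - \left(\hat z \hat p_x - \hat x \hat p_z\right)\left(\hat y \hat p_z - \hat z \hat p_y\right) \\
 &= \hat y \hat p_z \hat z \hat p_x + \hat z \hat p_y \hat x \hat p_z - \hat z \hat p_y \hat z \hat p_x - \hat y \hat p_z \hat x \hat p_z - \hat z \hat p_x \hat y \hat p_z - \hat x \hat p_z \hat z \hat p_y + \hat z \hat p_x \hat z \hat p_y + \hat x \hat p_z \hat y \hat p_z \\
 &= \hat y \hat p_x \hat p_z \hat z + \hat p_y \hat x \hat z \hat p_z - \hat z^2 \hat p_y \hat p_x - \hat p_z^2 \hat x \hat y - \hat y \hat p_x \hat z \hat p_z - \hat p_y \hat x \hat p_z \hat z + \hat z^2 \hat p_y \hat p_x + \hat p_z^2 \hat x \hat y \\
 &= \hat y \hat p_x \left(\hat p_z \hat z - \hat z \hat p_z\right) + \hat p_y \hat x \left(\hat z \hat p_z - \hat p_z \hat z\right) = i\hbar \left(\hat p_y \hat x - \hat y \hat p_x\right) = i\hbar \hat L_z ,
\end{aligned} \tag{3.53}
$$

where, when transitioning from the second line to the third, I took into account that different components of the coordinate and momentum operators do commute, so that their order can be changed at will. The terms containing $\hat z^2$ and $\hat p_z^2$ in the third line cancel in pairs. Similarly, you will find (do it!)

$$
\left[\hat L_z, \hat L_x\right] = i\hbar \hat L_y \tag{3.54}
$$

$$
\left[\hat L_y, \hat L_z\right] = i\hbar \hat L_x . \tag{3.55}
$$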
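Before doing the algebra by hand, the three commutators can be tested on a concrete function. The sketch below is a plain-Python check under two assumptions that are not established in this section: the momentum operator is taken in its coordinate form $-i\hbar\,\partial/\partial x$ acting on polynomials in $x, y, z$, and units with $\hbar = 1$ are used. Polynomials are stored as dictionaries mapping exponent triples to coefficients, and all function names are illustrative.

```python
HBAR = 1.0  # units with hbar = 1 (a convenience assumption)

def add(p, q, scale=1):
    """Return p + scale * q for polynomials stored as {(i, j, k): coefficient}."""
    out = dict(p)
    for mono, coef in q.items():
        out[mono] = out.get(mono, 0) + scale * coef
    return {mono: coef for mono, coef in out.items() if abs(coef) > 1e-12}

def times_coord(p, axis):
    """Multiply by x, y, or z (axis = 0, 1, 2): raise that exponent by one."""
    return {mono[:axis] + (mono[axis] + 1,) + mono[axis + 1:]: coef
            for mono, coef in p.items()}

def momentum(p, axis):
    """Apply -i*hbar*d/d(axis); this coordinate form of p is assumed here."""
    return {mono[:axis] + (mono[axis] - 1,) + mono[axis + 1:]:
            -1j * HBAR * mono[axis] * coef
            for mono, coef in p.items() if mono[axis] > 0}

def L(p, comp):
    """Apply L_comp: L_x = y p_z - z p_y for comp = 0, then cyclic permutations."""
    a, b = (comp + 1) % 3, (comp + 2) % 3
    return add(times_coord(momentum(p, b), a),
               times_coord(momentum(p, a), b), scale=-1)

def residual(f, a, b, c):
    """Return [L_a, L_b] f - i*hbar*L_c f; an empty dict means zero."""
    comm = add(L(L(f, b), a), L(L(f, a), b), scale=-1)
    rhs = {mono: 1j * HBAR * coef for mono, coef in L(f, c).items()}
    return add(comm, rhs, scale=-1)

# Arbitrary test polynomial f = x^2 y + 3 y z^2 + 2 x z
f = {(2, 1, 0): 1, (0, 1, 2): 3, (1, 0, 1): 2}
residuals = [residual(f, 0, 1, 2),   # Eq. 3.53
             residual(f, 2, 0, 1),   # Eq. 3.54
             residual(f, 1, 2, 0)]   # Eq. 3.55
print(residuals)
```

A check on one polynomial is of course not a proof, but a non-empty residual would immediately flag a sign or ordering error in a hand calculation.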


These results indicate that the vector of the angular momentum in quantum theory is quite different from regular classical vectors as well as from vector operators of position and momentum: different components of this vector do not belong to the same group of mutually commuting operators and do not represent mutually consistent observables, meaning that this vector is not really well-defined. More specifically, if a quantum system is in a state in which one of the Cartesian components of the angular momentum is known with certainty, measurements of the two other components will produce statistically uncertain results. This conclusion, in addition to making the direction of the angular momentum vector uncertain, also raises a question about its magnitude. Indeed, the magnitude of a generic classical 3-D vector is defined as $|\mathbf A| = \sqrt{A_x^2 + A_y^2 + A_z^2}$. Formal quantization of this expression is not possible because the square root of an operator, $\sqrt{\hat A_x^2 + \hat A_y^2 + \hat A_z^2}$, is not a well-defined object. In the case of position and momentum operators, this problem did not arise because different components of these operators are commuting so that one can always choose a coordinate system in which all but one component of the position or momentum operators are equal to zero. The possible values of the remaining non-zero component will define the magnitude of the entire vector. This approach is not possible in the case of angular momentum because of the incompatibility of its components. This problem is circumvented by choosing the operator of the square of angular momentum defined as

$$
\hat L^2 = \hat L_x^2 + \hat L_y^2 + \hat L_z^2 \tag{3.56}
$$

to represent its magnitude. Computing commutators $\left[\hat L^2, \hat L_{x,y,z}\right]$, you will find that all three commutators vanish. (The proof of this statement is left to you as an exercise.) This means that operators of the square of the angular momentum and one (any) component of the angular momentum are compatible observables, so that a quantum system can be created in a state in which one of the components and the magnitude of the angular momentum are known with certainty. Obviously such a state would be a common eigenvector of $\hat L^2$ and $\hat L_z$.

Quantization of $\mathbf p \cdot \mathbf r$

As a last example, consider a classical expression of the form $\mathbf p \cdot \mathbf r$, which appears in some applications. An attempt to directly transform this expression into the quantum form by promoting the momentum and position vectors to operators faces two obstacles. First, the operators in this expression do not commute, and so it is unclear what the correct order of multiplication is. Second, even if I arbitrarily impose a particular order, say, $\hat{\mathbf p} \cdot \hat{\mathbf r}$, the resulting operator is not Hermitian because $\left(\hat{\mathbf p} \cdot \hat{\mathbf r}\right)^\dagger = \hat{\mathbf r} \cdot \hat{\mathbf p} \neq \hat{\mathbf p} \cdot \hat{\mathbf r}$. To carry out the quantization procedure in this case, you need to come up with an expression which would coincide with its original classical version but would not depend on the order of the operators, and be Hermitian. One way to achieve this is to introduce operator

$$
\frac{1}{2} \left(\hat{\mathbf p} \cdot \hat{\mathbf r} + \hat{\mathbf r} \cdot \hat{\mathbf p}\right),
$$

which satisfies all these conditions. However, this quantization procedure is not unique, and it might (and does) create problems down the road, but luckily for us this is not the road I choose for us to travel.
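How much the two orderings differ follows directly from the canonical commutator, Eq. 3.47. As a short sketch, writing the scalar products in Cartesian components and using $\hat p_i \hat r_i = \hat r_i \hat p_i - i\hbar$ for each of the three components gives

```latex
\[
\begin{aligned}
\hat{\mathbf p} \cdot \hat{\mathbf r}
  &= \sum_{i=1}^{3} \hat p_i \hat r_i
   = \sum_{i=1}^{3} \left( \hat r_i \hat p_i - i\hbar \right)
   = \hat{\mathbf r} \cdot \hat{\mathbf p} - 3 i\hbar , \\
\frac{1}{2} \left( \hat{\mathbf p} \cdot \hat{\mathbf r} + \hat{\mathbf r} \cdot \hat{\mathbf p} \right)
  &= \hat{\mathbf r} \cdot \hat{\mathbf p} - \frac{3}{2}\, i\hbar .
\end{aligned}
\]
```

The two orderings thus differ by the constant $3i\hbar$, and the symmetrized operator sits exactly halfway between them.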

3.3.4 Eigenvalues of the Angular Momentum

The operators of the angular momentum play an extraordinary role in quantum theory, both on the fundamental level and for applications. The fundamental role of the angular momentum is derived from its relation to the rotation operator and rotational symmetry of quantum systems, but discussion of this topic is well above your pay grade. Those interested in the topic are free to consult any graduate level quantum mechanics text. From the point of view of applications, the importance of the angular momentum stems from the fact that many fundamental interactions in nature are described by so-called central potentials. The potential energy of such interactions depends only on the absolute value of the distance between two interacting particles, but not on the orientation of the vector of their relative position. This text is mostly concerned with quantum mechanics of a single particle in an external potential (a two-particle problem can often be presented in this form as well). If the external potential belongs to the class of central potentials, it can be shown that the Hamiltonian of such a system commutes with all components of the angular momentum as well as with operator $\hat{L}^2$. The proof of this statement requires proving it separately for the kinetic energy operator (essentially for operator $\hat{p}^2$) and for the potential energy operator $V(\hat{r})$. I believe that the readers of this text are already equipped to prove that

$$[\hat{L}_{x,y,z}, \hat{p}^2] = [\hat{L}^2, \hat{p}^2] = 0,$$

so I leave it to you as an exercise.

As far as the commutators with the potential energy operator go, this proof will have to be left till later.

Vanishing of the commutators of angular momentum operators and the Hamiltonian means that the Hamiltonian, $\hat{L}^2$, and one of the components of the angular momentum form a system of commuting operators and that the eigenvectors of $\hat{L}^2$ and, say, $\hat{L}_z$ are also eigenvectors of the Hamiltonian. This fact can significantly simplify finding eigenvalues and eigenvectors of the Hamiltonian.

It is also remarkable that the eigenvalues of $\hat{L}^2$ and, for instance, $\hat{L}_z$ can be found using only commutation relations given by Eqs. 3.53–3.55. The choice of the z-component here is a historical accident and does not have any physical significance. By choosing this particular component, which, you should understand, is tied to a particular choice of the coordinate system, we essentially say to the experimentalists that if the quantum system is in a state described by the eigenvectors of $\hat{L}_z$ as defined by this coordinate system, then a measurement of the component of the angular momentum in the same direction will produce the respective eigenvalue with certainty, while measurements of any other component of the angular momentum will have quantum uncertainty.


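I begin the search for the eigenvalues by introducing abstract vectors $|\lambda_L, \lambda_z\rangle$ defined as common eigenvectors of operators $\hat{L}^2$ and $\hat{L}_z$ characterized by some yet unknown eigenvalues $\lambda_L$ and $\lambda_z$:

$$\hat{L}^2 |\lambda_L, \lambda_z\rangle = \lambda_L |\lambda_L, \lambda_z\rangle, \tag{3.57}$$

$$\hat{L}_z |\lambda_L, \lambda_z\rangle = \lambda_z |\lambda_L, \lambda_z\rangle. \tag{3.58}$$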

It is convenient to present these eigenvalues as $\lambda_L = \hbar^2 p$ and $\lambda_z = \hbar m$. Pulling out factors $\hbar^2$ and $\hbar$ from the eigenvalues of $\hat{L}^2$ and $\hat{L}_z$, respectively, makes the remaining quantities $p$ and $m$ dimensionless, since the dimension of the angular momentum is the same as that of Planck's constant. Evidently, I will need to invoke, somehow, the two remaining components of the angular momentum. It is not immediately obvious how to do it, but let's say that I have had a divine intervention or premonition that the following two new operators might be useful:

$$\hat{L}_+ = \hat{L}_x + i\hat{L}_y, \tag{3.59}$$

$$\hat{L}_- = \hat{L}_x - i\hat{L}_y. \tag{3.60}$$

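The first thing I need to do with these operators is to compute their commutators with operators $\hat{L}^2$ and $\hat{L}_z$:

$$[\hat{L}_+, \hat{L}_z] = [\hat{L}_x, \hat{L}_z] + i[\hat{L}_y, \hat{L}_z] = -i\hbar\hat{L}_y - \hbar\hat{L}_x = -\hbar\hat{L}_+, \tag{3.61}$$

$$[\hat{L}_-, \hat{L}_z] = [\hat{L}_x, \hat{L}_z] - i[\hat{L}_y, \hat{L}_z] = -i\hbar\hat{L}_y + \hbar\hat{L}_x = \hbar\hat{L}_-. \tag{3.62}$$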

It is also easy to see that the commutators $[\hat{L}^2, \hat{L}_\pm]$ vanish. Indeed, $\hat{L}^2$ commutes with all component operators and, therefore, with $\hat{L}_\pm$, which are combinations of $\hat{L}_x$ and $\hat{L}_y$. Now, the new operators for a theoretician are like new toys for a child, and I am eager to play with them and see what they can do. So, to satisfy the urge, and in hopes of learning something new, I want to apply operators $\hat{L}_\pm$ to Eq. 3.57:

$$\hat{L}_\pm \hat{L}^2 |\lambda_L, \lambda_z\rangle = \hat{L}^2 \hat{L}_\pm |\lambda_L, \lambda_z\rangle = \hbar^2 p\, \hat{L}_\pm |\lambda_L, \lambda_z\rangle, \tag{3.63}$$

where I used $\hat{L}^2\hat{L}_\pm = \hat{L}_\pm\hat{L}^2$. OK, and what did we learn from this exercise? Well, I know now that if $|\lambda_L, \lambda_z\rangle$ is an eigenvector of $\hat{L}^2$ with eigenvalue $\hbar^2 p$, then the vector $\hat{L}_\pm |\lambda_L, \lambda_z\rangle$ is still an eigenvector of $\hat{L}^2$ with the same eigenvalue, which is not really surprising because $\hat{L}_\pm$ do commute with $\hat{L}^2$. So far, it is not much, and you would be right to say that so far operators $\hat{L}_\pm$ have not given us any particular advantages because we would have gotten the same result with operators $\hat{L}_{x,y}$. But let's not jump the gun (always a bad idea), while patience and persistence are virtues. Instead, let me play another game and apply $\hat{L}_+$ to Eq. 3.58:

$$\begin{aligned}
\hat{L}_+ \hat{L}_z |\lambda_L, \lambda_z\rangle &= \hbar m\, \hat{L}_+ |\lambda_L, \lambda_z\rangle, \\
\left(\hat{L}_z \hat{L}_+ - \hbar\hat{L}_+\right) |\lambda_L, \lambda_z\rangle &= \hbar m\, \hat{L}_+ |\lambda_L, \lambda_z\rangle, \\
\hat{L}_z \hat{L}_+ |\lambda_L, \lambda_z\rangle &= \hbar (m + 1)\, \hat{L}_+ |\lambda_L, \lambda_z\rangle,
\end{aligned} \tag{3.64}$$

where I used commutation relation 3.61 to make the transition from the first to the second line. Now, the last line in Eq. 3.64 tells us that $\hat{L}_+ |\lambda_L, \lambda_z\rangle$ is an eigenvector of $\hat{L}_z$ with eigenvalue $\hbar m + \hbar$. This is quite an exciting result: it means that if I start with some eigenvector with a known eigenvalue, I can generate new eigenvectors with progressively increasing eigenvalues: $\hbar m + \hbar,\ \hbar m + 2\hbar,\ \hbar m + 3\hbar, \ldots$. This is already something new, which we could not have gotten without the operator $\hat{L}_+$. The secret of this operator lies in its commutator with $\hat{L}_z$, which is proportional to $\hat{L}_+$ itself. The same is true for operator $\hat{L}_-$, so it is worth looking into what this operator can do:

$$\begin{aligned}
\hat{L}_- \hat{L}_z |\lambda_L, \lambda_z\rangle &= \hbar m\, \hat{L}_- |\lambda_L, \lambda_z\rangle, \\
\left(\hat{L}_z \hat{L}_- + \hbar\hat{L}_-\right) |\lambda_L, \lambda_z\rangle &= \hbar m\, \hat{L}_- |\lambda_L, \lambda_z\rangle, \\
\hat{L}_z \hat{L}_- |\lambda_L, \lambda_z\rangle &= \hbar (m - 1)\, \hat{L}_- |\lambda_L, \lambda_z\rangle.
\end{aligned} \tag{3.65}$$

When deriving Eq. 3.65, I again applied the commutator from Eq. 3.62 to its first line. The final result of this calculation indicates that operator $\hat{L}_-$ also generates new eigenvectors of $\hat{L}_z$ but with progressively decreasing eigenvalues. Not surprisingly, operators $\hat{L}_+$ and $\hat{L}_-$ are called raising and lowering ladder operators.

Now, the question arises: will this process of generating new eigenvectors and eigenvalues ever stop? In other words, can operator $\hat{L}_z$ have arbitrarily large and arbitrarily small eigenvalues? Intuitively, it is clear that the answer to this question must be negative and that the possible eigenvalues of $\hat{L}_z$ must be limited both from above and from below. Indeed, these eigenvalues represent possible results of the measurement of one component of a vector, while eigenvalues of $\hat{L}^2$ represent possible experimentally observable values of the squared magnitude of the same vector. It is difficult to imagine that the component of a vector can be larger than the magnitude of the same vector, and therefore one should expect that there must be some kind of relation between these two eigenvalues, e.g., something like $m^2 \le p$. In order to see if such a relation indeed exists, consider the following expression:

$$\langle\lambda_L, \lambda_z| \hat{L}^2 |\lambda_L, \lambda_z\rangle = \langle\lambda_L, \lambda_z| \hat{L}_x^2 |\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z| \hat{L}_y^2 |\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z| \hat{L}_z^2 |\lambda_L, \lambda_z\rangle.$$

Taking into account Eqs. 3.57 and 3.58, this can be written as

$$\hbar^2 p = \langle\lambda_L, \lambda_z| \hat{L}_x^2 |\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z| \hat{L}_y^2 |\lambda_L, \lambda_z\rangle + \hbar^2 m^2. \tag{3.66}$$

Since the expectation values of operators $\hat{L}_x^2$ and $\hat{L}_y^2$ in any state are non-negative quantities, Eq. 3.66 yields that $p \ge m^2$. This means that there exists the smallest $m$, which I will designate as $\bar{l}$, and there exists the largest $m$, for which I will use symbol $l$. Now, assume that you are dealing with the eigenvector $|\lambda_L, \hbar\bar{l}\rangle$ and applying operator $\hat{L}_-$ to it. Generally speaking, this operator must lower the eigenvalue, but we assumed that this eigenvalue is already the lowest. The only way to reconcile Eq. 3.65 with this assumption is to require that

$$\hat{L}_- |\lambda_L, \hbar\bar{l}\rangle = 0. \tag{3.67}$$

In order to figure out how to use this important piece of information, I again need a bit of divine inspiration, or I can just notice that the product of operators $\hat{L}_+\hat{L}_-$ can be expressed in terms of operators $\hat{L}^2$ and $\hat{L}_z$:

$$\hat{L}_+\hat{L}_- = \hat{L}_x^2 + \hat{L}_y^2 + i\hat{L}_y\hat{L}_x - i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 + \hbar\hat{L}_z.$$

Rewriting this expression as

$$\hat{L}^2 = \hat{L}_z^2 - \hbar\hat{L}_z + \hat{L}_+\hat{L}_-, \tag{3.68}$$

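and applying it to vector $|\lambda_L, \hbar\bar{l}\rangle$ while taking into account Eq. 3.67, I obtain

$$\begin{aligned}
\hat{L}^2 |\lambda_L, \hbar\bar{l}\rangle &= \hat{L}_z^2 |\lambda_L, \hbar\bar{l}\rangle - \hbar\hat{L}_z |\lambda_L, \hbar\bar{l}\rangle + \hat{L}_+\hat{L}_- |\lambda_L, \hbar\bar{l}\rangle \\
\Rightarrow\ \hbar^2 p\, |\lambda_L, \hbar\bar{l}\rangle &= \hbar^2\bar{l}^2 |\lambda_L, \hbar\bar{l}\rangle - \hbar^2\bar{l}\, |\lambda_L, \hbar\bar{l}\rangle \\
\Rightarrow\ p &= \bar{l}^2 - \bar{l}.
\end{aligned} \tag{3.69}$$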

Now, consider the state characterized by the largest value of $m$, $|\lambda_L, \hbar l\rangle$. Attempting to act on this vector with operator $\hat{L}_+$ leaves you with the same conundrum encountered when discussing vector $|\lambda_L, \hbar\bar{l}\rangle$, but by now you know the way out: you must require that

$$\hat{L}_+ |\lambda_L, \hbar l\rangle = 0. \tag{3.70}$$

The derivation of Eq. 3.69 based on Eq. 3.67 was successful because in Eq. 3.68 the lowering operator $\hat{L}_-$ stands to the right of operator $\hat{L}_+$ and therefore acts on the vector first. Consequently, when the product $\hat{L}_+\hat{L}_-$ is made to act on $|\lambda_L, \hbar\bar{l}\rangle$, the resulting expression vanishes. In order to achieve the same effect with state $|\lambda_L, \hbar l\rangle$ and Eq. 3.70, I need to modify Eq. 3.68 in such a way that it would contain the combination $\hat{L}_-\hat{L}_+$ instead of $\hat{L}_+\hat{L}_-$. To achieve this, consider

$$\hat{L}_-\hat{L}_+ = \hat{L}_x^2 + \hat{L}_y^2 - i\hat{L}_y\hat{L}_x + i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z, \tag{3.71}$$

which can be rewritten in the desired form

$$\hat{L}^2 = \hat{L}_z^2 + \hbar\hat{L}_z + \hat{L}_-\hat{L}_+. \tag{3.72}$$

Now applying $\hat{L}^2$ to $|\lambda_L, \hbar l\rangle$ and using Eqs. 3.72 and 3.70, I get

$$p = l^2 + l. \tag{3.73}$$

Comparing Eq. 3.69 with Eq. 3.73, I infer that the smallest and largest eigenvalues of $\hat{L}_z$ are related to each other as

$$l^2 + l = \bar{l}^2 - \bar{l}.$$

It is easy to see (one can always just solve the quadratic equation for $\bar{l}$) that this relation implies that $\bar{l} = -l$ or $\bar{l} = l + 1$. The latter solution contradicts the assumption that $\bar{l}$ is the smallest eigenvalue and $l$ is the largest; thus the only possibility that makes sense is $\bar{l} = -l$.

Now imagine that you have found the smallest eigenvalue $-\hbar l$ and you start applying operator $\hat{L}_+$ to state $|\lambda_L, -\hbar l\rangle$. After each application of the operator, the eigenvalue of $\hat{L}_z$ increases by $\hbar$, that is, $m$ increases by one, so that after applying it $N$ times, you end up with $m = -l + N$. Eventually you must reach the largest value $l$, at which point you will have $-l + N = l \Rightarrow 2l = N$. $N$ is obviously an integer, so $l$ can be either an integer, if $N$ is even, or a half-integer, if $N$ is odd.

Now, let us gather our thoughts and try to summarize what it is that we have got:

1. The eigenvalue of operator $\hat{L}^2$ is equal to $\hbar^2 l(l + 1)$, where $l$ determines the maximum eigenvalue of the operator $\hat{L}_z$, $\hbar l$.

2. $l$ can take either integer or half-integer values, forming two non-overlapping series of allowed values: $0, 1, 2, 3, \ldots$ or $1/2, 3/2, 5/2, \ldots$.

3. Allowed values of $m$ start at $-l$ and increase in steps of one until they reach $l$. For instance, for $l = 0$, the only possible value of $m$ is zero; for $l = 1/2$, $m$ can be $-1/2, 1/2$; and for $l = 1$, we can have states with $m = -1, 0, 1$. In general, for a given eigenvalue $\hbar^2 l(l + 1)$ of operator $\hat{L}^2$, there are $2l + 1$ possible states with different eigenvalues of $\hat{L}_z$.
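The counting rules in this summary can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names `allowed_m` and `l_squared_eigenvalue` are introduced here and are not part of the text, and exact fractions are used so that half-integer values are represented without rounding.

```python
from fractions import Fraction

def allowed_m(l):
    """Return the 2l + 1 allowed values of m for an integer or half-integer l."""
    l = Fraction(l)
    if l < 0 or (2 * l).denominator != 1:
        raise ValueError("l must be a non-negative integer or half-integer")
    count = int(2 * l) + 1          # number of states, 2l + 1
    return [-l + k for k in range(count)]

def l_squared_eigenvalue(l):
    """Eigenvalue of L^2 in units of hbar^2, that is, l(l + 1)."""
    l = Fraction(l)
    return l * (l + 1)

for l in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    values = [str(m) for m in allowed_m(l)]
    print(f"l = {l}: m = {values}, L^2 eigenvalue = {l_squared_eigenvalue(l)} hbar^2")
```

Passing `2 * l` through `Fraction` is a design choice made for this sketch: it rejects values such as 1/3 that belong to neither of the two allowed series.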

It is interesting to note that if I were talking about a classical vector, the maximum magnitude of its component along an axis would simply be equal to the length of the vector. If we interpret expression $\hbar l$ as such a component's length, then the squared length of the entire vector would have been $\hbar^2 l^2$, which is different from the quantum result $\hbar^2 l^2 + \hbar^2 l$. One can see that the "extra" contribution to the "length" comes from fluctuations of the two other components of the angular momentum. Indeed, using what you have learned from Eq. 3.66, you can write

$$\begin{aligned}
\hbar^2 l^2 + \hbar^2 l &= \hbar^2 l^2 + \langle l, l| \hat{L}_x^2 |l, l\rangle + \langle l, l| \hat{L}_y^2 |l, l\rangle \\
\Rightarrow\ \langle l, l| \hat{L}_x^2 |l, l\rangle + \langle l, l| \hat{L}_y^2 |l, l\rangle &= \hbar^2 l \\
\Rightarrow\ \langle l, l| \hat{L}_x^2 |l, l\rangle = \langle l, l| \hat{L}_y^2 |l, l\rangle &= \hbar^2 l/2.
\end{aligned}$$

In the last expression, I introduced a shortcut notation for the common eigenvectors of operators $\hat{L}^2$ and $\hat{L}_z$, which in general looks like $|l, m\rangle$, with the first number indicating that this vector belongs to the eigenvalue $\hbar^2 l(l + 1)$ of $\hat{L}^2$ and the second number pointing at the eigenvalue $\hbar m$ of $\hat{L}_z$. For brevity, $l$ is often referred to as the "angular momentum," and $m$ is often called a "magnetic" quantum number. The origin of this name will become clear later, when we get to consider the behavior of atoms in the magnetic field.

Finally, let me note that even though we know now that ladder operators $\hat{L}_\pm$ generate eigenvectors of $\hat{L}_z$, there is no guarantee that the resulting eigenvectors will be normalized even if the initial vector is. So, in order to finalize the rule for obtaining normalized eigenvectors using ladder operators, we have to analyze their action more carefully. First, it is easy to see that they are Hermitian conjugates of each other:

$$\hat{L}_- = \hat{L}_+^\dagger. \tag{3.74}$$

Assuming that vectors $|l, m\rangle$ and $|l, m + 1\rangle$ are normalized and introducing a yet unknown normalization coefficient $A_{l,m}$, I can write

$$\hat{L}_+ |l, m\rangle = A_{l,m} |l, m + 1\rangle.$$

The Hermitian conjugation of this expression yields

$$\langle l, m| \hat{L}_- = \langle l, m + 1| A_{l,m}^*.$$

Multiplying the left-hand side of this equation by the left-hand side of the previous one and doing the same to their right-hand sides yields

$$\langle l, m| \hat{L}_-\hat{L}_+ |l, m\rangle = A_{l,m}^* A_{l,m} \langle l, m + 1 | l, m + 1\rangle.$$

Since it was assumed that all ket vectors are normalized, I now immediately have for $|A_{l,m}|^2$:

$$|A_{l,m}|^2 = \langle l, m| \hat{L}_-\hat{L}_+ |l, m\rangle.$$

Taking into account Eq. 3.71, and the fact that the kets in this expression are eigenvectors of $\hat{L}^2$ and $\hat{L}_z$, I find

$$|A_{l,m}|^2 = \hbar^2 \left[l(l + 1) - m(m + 1)\right],$$

which, with the phase of $A_{l,m}$ chosen so that the coefficient is real and positive, allows one to establish the final rule for the generation of new eigenvectors from the known ones:

$$\hat{L}_+ |l, m\rangle = \hbar\sqrt{l(l + 1) - m(m + 1)}\; |l, m + 1\rangle. \tag{3.75}$$

I will leave it to you to show that

$$\hat{L}_- |l, m\rangle = \hbar\sqrt{l(l + 1) - m(m - 1)}\; |l, m - 1\rangle. \tag{3.76}$$

To conclude this section, let me just emphasize once again that we were able to find eigenvalues for the system of operators, as well as a rule for generating their eigenvectors, using nothing but their commutation relations. The key to successful completion of this task was the existence of the ladder operators with their very special commutation relations given by Eqs. 3.61 and 3.62.
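As a numerical illustration of this section (a sketch, not a replacement for the algebra), one can build the matrices of the ladder operators from Eq. 3.75 and check Eqs. 3.61, 3.62, and 3.68 directly. The assumptions are mine: units with $\hbar = 1$, the basis ordered as $m = l, l - 1, \ldots, -l$, and helper names such as `ladder_matrices` and `residuals` that do not come from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1.0):
    """Return a + sign * b."""
    n = len(a)
    return [[a[i][j] + sign * b[i][j] for j in range(n)] for i in range(n)]

def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), sign=-1.0)

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def ladder_matrices(two_l):
    """Matrices of L_z, L_+, L_- (hbar = 1) in the basis |l, m>, m = l, ..., -l.

    The argument is 2l, passed as an integer so that half-integer l is allowed.
    """
    l = two_l / 2.0
    ms = [l - k for k in range(two_l + 1)]
    n = len(ms)
    lz = [[ms[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    lp = [[0.0] * n for _ in range(n)]
    for col, m in enumerate(ms):
        if col > 0:  # |l, m + 1> sits one position earlier in this ordering
            lp[col - 1][col] = math.sqrt(l * (l + 1) - m * (m + 1))  # Eq. 3.75
    # L_- is the Hermitian conjugate of L_+ (Eq. 3.74); the entries are real here.
    lm = [[lp[j][i] for j in range(n)] for i in range(n)]
    return lz, lp, lm

def residuals(two_l):
    """Largest violations of Eqs. 3.61, 3.62 and of L^2 = l(l + 1) via Eq. 3.68."""
    lz, lp, lm = ladder_matrices(two_l)
    l = two_l / 2.0
    n = two_l + 1
    target = [[l * (l + 1) if i == j else 0.0 for j in range(n)] for i in range(n)]
    r_plus = max_abs(add(commutator(lp, lz), lp))               # [L+, Lz] + L+
    r_minus = max_abs(add(commutator(lm, lz), lm, sign=-1.0))   # [L-, Lz] - L-
    l2 = add(add(matmul(lz, lz), lz, sign=-1.0), matmul(lp, lm))  # Eq. 3.68
    r_l2 = max_abs(add(l2, target, sign=-1.0))
    return r_plus, r_minus, r_l2

for two_l in range(1, 5):
    print("l =", two_l / 2.0, "residuals:", residuals(two_l))
```

All three residuals should be zero up to floating-point rounding for every $l$ tried, which is a quick way to catch a sign error in Eqs. 3.75 and 3.76.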

3.3.5 Statistical Interpretation

In Chap. 2 I have already introduced the relation between coefficients in the superposition states and probabilities of various outcomes of the measurements on quantum systems. This time I will elaborate those ideas in a more precise way by formulating two postulates introducing statistical interpretation to the formalism of quantum mechanics.

Postulate 4 (Born’s Rule) A measurement of an observable can only yield a value from the set of the eigenvalues of the operator representing the measured observable. If a system before the measurement is not in a state described by one of the eigenvectors of this operator, the result of the measurement cannot be predicted a priori. Only a probability (or probability density for observables with continuous spectrum) of a particular outcome can be known. If the measured eigenvalue is not degenerate, this probability is given by

$$p_n = |\langle\alpha | \lambda_n\rangle|^2, \tag{3.77}$$

where $|\alpha\rangle$ represents the state of the system before the measurement, $\lambda_n$ is one of the eigenvalues, and $|\lambda_n\rangle$ is the corresponding eigenvector. If the eigenvalue is degenerate, the probabilities given by Eq. 3.77 must be summed over all the degenerate eigenvectors belonging to this eigenvalue. In the case of observables with continuous spectrum, the probability is replaced with the probability density $p(q)$:

$$p(q) = |\langle\alpha | q\rangle|^2,$$

which determines the differential probability $dP$ that the measured value of the observable lies within the interval $[q, q + dq]$ as $dP = p(q)\,dq$.

Postulate 5 Regardless of the state in which the system was before an observable is measured, immediately after the measurement, the system will be in a state represented by the eigenvector of the corresponding operator belonging to the observed non-degenerate eigenvalue. If the measured eigenvalue is degenerate, all we can state is that after the measurement the system will be in a state in the subspace of eigenvectors belonging to this eigenvalue.

Both these postulates are essentially more accurate restatements of the propositions already discussed in Sect. 2.2.3, where the somewhat vague notion of "the state with definite values of an observable" is replaced with its mathematical representation as an eigenvector of a respective operator. This more formal approach allows carrying out a more comprehensive exploration of the statistical interpretation of quantum mechanical formalism.
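The two postulates can be mimicked by a short simulation. The sketch below is only an illustration under assumptions of my own: the observable is the 2×2 matrix with rows (0, 1) and (1, 0), whose normalized eigenvectors are written out by hand, the initial state is an arbitrary choice, and the function name `measure` is hypothetical. Outcomes are drawn with Born's probabilities, and the state is replaced by the corresponding eigenvector after each measurement.

```python
import math
import random

def inner(bra, ket):
    """<bra|ket>; the components of the bra are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def measure(state, eigen_pairs, rng):
    """Simulate one measurement of a non-degenerate observable.

    eigen_pairs is a list of (eigenvalue, normalized eigenvector).
    Returns the observed eigenvalue and the state right after the measurement.
    """
    # Postulate 4 (Born's rule): p_n = |<alpha|lambda_n>|^2
    probabilities = [abs(inner(vec, state)) ** 2 for _, vec in eigen_pairs]
    index = rng.choices(range(len(eigen_pairs)), weights=probabilities)[0]
    value, vector = eigen_pairs[index]
    # Postulate 5: the system is left in the eigenvector of the observed value
    return value, list(vector)

s = 1 / math.sqrt(2)
eigen_pairs = [(+1, [s, s]), (-1, [s, -s])]       # eigenvectors of [[0, 1], [1, 0]]
alpha = [(1 + 1j) / math.sqrt(3), 1 / math.sqrt(3)]

rng = random.Random(0)                             # fixed seed for reproducibility
outcomes = [measure(alpha, eigen_pairs, rng)[0] for _ in range(20000)]
frequency_plus = outcomes.count(+1) / len(outcomes)
print("relative frequency of +1:", frequency_plus)

# Measuring again immediately after the first measurement repeats its result.
value, collapsed = measure(alpha, eigen_pairs, rng)
repeat, _ = measure(collapsed, eigen_pairs, rng)
print("first result:", value, "second result:", repeat)
```

For this particular state Born's rule gives a probability of 5/6 for the outcome +1, so the printed relative frequency should settle close to that number as the number of simulated copies grows.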

I begin by considering an expression of the form $\langle\alpha| \hat{T} |\alpha\rangle$, where $|\alpha\rangle$ is an arbitrary state and $\hat{T}$ is a Hermitian operator representing a certain observable. I have already mentioned that this expression is often referred to as the "expectation value," but now I can demonstrate what it actually means. Expanding this state into eigenvectors of $\hat{T}$ (Eq. 3.39), I can present $\langle\alpha| \hat{T} |\alpha\rangle$ as

$$\langle\alpha| \hat{T} |\alpha\rangle = \sum_n \sum_m a_n^* a_m \langle\lambda_n| \hat{T} |\lambda_m\rangle = \sum_n \sum_m \lambda_m a_n^* a_m \langle\lambda_n | \lambda_m\rangle = \sum_n \lambda_n |a_n|^2, \tag{3.78}$$

where I first took advantage of the fact that $|\lambda_m\rangle$ is an eigenvector of $\hat{T}$ with eigenvalue $\lambda_m$, $\hat{T} |\lambda_m\rangle = \lambda_m |\lambda_m\rangle$, and then used the orthonormalization condition for the eigenvectors, $\langle\lambda_n | \lambda_m\rangle = \delta_{nm}$. According to Born's rule, $|a_n|^2$ is the probability that the measurement of the observable will produce $\lambda_n$. Then, it becomes clear that the final result in Eq. 3.78 has the meaning of the average value of the observable, which one would "expect" to find if the same measurement is repeated multiple times or if an experimentalist carries out the measurement on multiple identical copies of the same system. The simplest measure of the statistical uncertainty of such measurements would be the standard deviation, which in regular probability theory would be defined as

$$\sigma_T = \sqrt{\overline{\lambda^2} - \bar{\lambda}^2},$$

where the bar above the letters means statistical averaging with probabilities given by $p_n = |a_n|^2$: $\overline{\lambda^2} = \sum_n p_n \lambda_n^2$; $\bar{\lambda}^2 = \left(\sum_n p_n \lambda_n\right)^2$. In the context of quantum theory, the measure of uncertainty of a measurement can be described as

$$\sigma_T = \sqrt{\langle\alpha| \hat{T}^2 |\alpha\rangle - \left(\langle\alpha| \hat{T} |\alpha\rangle\right)^2}. \tag{3.79}$$

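Indeed,

$$\begin{aligned}
\langle\alpha| \hat{T}^2 |\alpha\rangle &= \sum_n \sum_m a_n^* a_m \langle\lambda_n| \hat{T}\hat{T} |\lambda_m\rangle = \sum_n \sum_m \lambda_m a_n^* a_m \langle\lambda_n| \hat{T} |\lambda_m\rangle \\
&= \sum_n \sum_m \lambda_m^2 a_n^* a_m \langle\lambda_n | \lambda_m\rangle = \sum_n \lambda_n^2 |a_n|^2.
\end{aligned}$$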

This shows that the measure of uncertainty expressed by Eq. 3.79 does agree with the probabilistic definition of the standard deviation. If state $|\alpha\rangle$ is one of the eigenvectors $|\lambda_{n_0}\rangle$, all coefficients $a_n$ are zero, with the exception of $a_{n_0} = 1$. In this case, we have $\langle\alpha| \hat{T}^2 |\alpha\rangle = \lambda_{n_0}^2 = \left(\langle\alpha| \hat{T} |\alpha\rangle\right)^2$, and the uncertainty $\sigma_T$ vanishes.

This justifies calling states represented by eigenvectors determinate states, or states in which the observable has a definite value. If there are several mutually consistent observables represented by commuting operators, we can have a state that is a common eigenvector of all these operators and in which all these observables have definite values.
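The agreement between the probabilistic and the operator forms of the average and of the uncertainty is easy to check numerically. The following sketch uses an example of my own choosing (the 2×2 matrix with rows (0, 1) and (1, 0), its hand-written eigenvectors, and an arbitrary normalized state); the helper names `expectation` and `uncertainty` are not from the text.

```python
import math

def inner(bra, ket):
    """<bra|ket>; the components of the bra are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def apply(matrix, ket):
    """Act with a matrix on a column vector."""
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in matrix]

def expectation(matrix, state):
    """<state|matrix|state>, which is real for a Hermitian matrix."""
    return inner(state, apply(matrix, state)).real

def uncertainty(matrix, state):
    """Uncertainty in the operator form of Eq. 3.79."""
    mean = expectation(matrix, state)
    mean_of_square = inner(state, apply(matrix, apply(matrix, state))).real
    # max(..., 0) guards against tiny negative values caused by rounding
    return math.sqrt(max(mean_of_square - mean ** 2, 0.0))

T = [[0, 1], [1, 0]]
s = 1 / math.sqrt(2)
eigen_pairs = [(+1, [s, s]), (-1, [s, -s])]
alpha = [(1 + 1j) / math.sqrt(3), 1 / math.sqrt(3)]

# Probabilistic side: p_n = |a_n|^2, then ordinary averages over outcomes
p = [abs(inner(vec, alpha)) ** 2 for _, vec in eigen_pairs]
mean_prob = sum(lam * pn for (lam, _), pn in zip(eigen_pairs, p))
mean_sq_prob = sum(lam ** 2 * pn for (lam, _), pn in zip(eigen_pairs, p))
sigma_prob = math.sqrt(mean_sq_prob - mean_prob ** 2)

print("average:     ", mean_prob, "vs", expectation(T, alpha))
print("uncertainty: ", sigma_prob, "vs", uncertainty(T, alpha))
print("uncertainty in an eigenstate:", uncertainty(T, eigen_pairs[0][1]))
```

The two columns of numbers should coincide up to rounding, and the last line should print a value indistinguishable from zero.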

If two observables are not mutually consistent and are described by operators $\hat{T}_1$ and $\hat{T}_2$ that do not commute, one can derive the following inequality for the uncertainties $\sigma_{T_1}$ and $\sigma_{T_2}$ of these observables:

$$\sigma_{T_1}\sigma_{T_2} \ge \frac{1}{2}\left|\langle\alpha| [\hat{T}_1, \hat{T}_2] |\alpha\rangle\right|, \tag{3.80}$$

which is valid for an arbitrary state $|\alpha\rangle$. This is the so-called generalized uncertainty principle. Using the canonical commutation relations, Eq. 3.47, I can immediately reproduce the Heisenberg inequality

$$\sigma_x\sigma_p \ge \frac{1}{2}\hbar, \tag{3.81}$$

which now becomes a particular case of a more general result presented by Eq. 3.80. It is interesting that using the Heisenberg uncertainty principle, Eq. 1.4, as an empirical formula and combining it with Eq. 3.80, I can "derive" or justify, if you want, the canonical commutator between the coordinate and momentum operators. Indeed, since Eq. 3.81 is valid for an arbitrary state, in order to reconcile Eq. 3.81 with Eq. 3.80, I have to admit that the commutator of coordinate and momentum operators must be a regular number (only in this case the right-hand side of Eq. 3.80 becomes proportional to $\langle\alpha | \alpha\rangle = 1$, so that the dependence on the state vanishes). The absolute value of this number must obviously be equal to $\hbar$, but recalling that if the commutator of two Hermitian operators is a number, it must be an imaginary number (see Eq. 3.31), I can conclude that $[\hat{x}, \hat{p}_x] = i\hbar$, which is the canonical commutation relation given in Eq. 3.47. Of course, these arguments are not sufficient to show whether this commutator is $+i\hbar$ or $-i\hbar$, but the choice of the sign is, actually, a matter of convention, and the standard agreement is to write this commutator as given in Eq. 3.47.
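Inequality 3.80 can be probed numerically on a small example. The sketch below is a spot check rather than a proof: the two non-commuting Hermitian 2×2 matrices and the list of test states are my own choices, and the helper names (`uncertainty`, `both_sides`) are hypothetical.

```python
import math

def inner(bra, ket):
    """<bra|ket>; the components of the bra are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def apply(matrix, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in matrix]

def uncertainty(matrix, state):
    """Uncertainty in the form of Eq. 3.79."""
    mean = inner(state, apply(matrix, state)).real
    mean_of_square = inner(state, apply(matrix, apply(matrix, state))).real
    return math.sqrt(max(mean_of_square - mean ** 2, 0.0))

def both_sides(t1, t2, state):
    """Return the left- and right-hand sides of Eq. 3.80 for a normalized state."""
    lhs = uncertainty(t1, state) * uncertainty(t2, state)
    t1t2 = inner(state, apply(t1, apply(t2, state)))
    t2t1 = inner(state, apply(t2, apply(t1, state)))
    rhs = 0.5 * abs(t1t2 - t2t1)      # (1/2) |<alpha|[T1, T2]|alpha>|
    return lhs, rhs

T1 = [[0, 1], [1, 0]]
T2 = [[0, 1j], [-1j, 0]]

states = [
    [1, 0],
    [1 / math.sqrt(2), 1j / math.sqrt(2)],
    [(1 + 1j) / math.sqrt(3), 1 / math.sqrt(3)],
    [0.6, 0.8],
]
checks = [both_sides(T1, T2, state) for state in states]
for lhs, rhs in checks:
    print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}, holds: {lhs >= rhs - 1e-9}")
```

Some of these states turn the inequality into an equality, which is why the comparison allows a small rounding tolerance.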


To illustrate all these rather abstract postulates, I will finish this section with an example, in which, to save time, I will again use matrices M1 and M2 defined by Eq. 3.37.

Example 14 (Probabilities of Measurements) Assume that these matrices represent two observables of some quantum system and that you intend to measure these observables. It is given that the system is prepared in the state $|\psi\rangle$ represented by the column

$$|\psi\rangle = \frac{1}{\sqrt{7}}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix}$$

and you are asked to predict the results of different sequences of measurements of observables $M_1$ and $M_2$. The first step is to verify that your initial state is normalized, which is just a good housekeeping habit. The squared norm of this vector is (do not forget to do complex conjugation when converting a ket into a bra; for some reason even good students keep forgetting about it)

$$\|\psi\|^2 = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix} = \frac{1}{7}\left((-2i)(2i) + 1 + (1 + i)(1 - i)\right) = \frac{1}{7}(4 + 1 + 2) = 1.$$

Once normalization is verified, you are ready for the next step. Let's say you first want to measure the observable represented by $M_2$. We found earlier that the eigenvalues of this matrix are $\lambda_1 = 2$, $\lambda_2 = -1$, and $\lambda_3 = 1$. Thus, these are the values that you can expect to see on the dial of your measuring device (more or less; experimental errors are unavoidable, of course). The actual issue is to find the corresponding probabilities. Using Born's rule, Eq. 3.77, and the corresponding eigenvectors given in Eq. 3.38, you can find for each of the eigenvalues

$$p_{\lambda_1} = |\langle\psi | 2\rangle|^2 = \left|\frac{1}{\sqrt{2}}\frac{1}{\sqrt{7}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right|^2 = \frac{1}{14}\left|2i + 1 + i\right|^2 = \frac{5}{7},$$

$$p_{\lambda_2} = |\langle\psi | {-1}\rangle|^2 = \left|\frac{1}{2}\frac{1}{\sqrt{7}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 \\ \sqrt{2} \\ 1 \end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i + \sqrt{2} + 1 + i\right|^2 = \frac{\left(1 + \sqrt{2}\right)^2 + 1}{28} = \frac{2 + \sqrt{2}}{14},$$

and

$$p_{\lambda_3} = |\langle\psi | 1\rangle|^2 = \left|\frac{1}{2}\frac{1}{\sqrt{7}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i - \sqrt{2} + 1 + i\right|^2 = \frac{\left(1 - \sqrt{2}\right)^2 + 1}{28} = \frac{2 - \sqrt{2}}{14}.$$

It is always a good idea to run a quick check:

$$p_{\lambda_1} + p_{\lambda_2} + p_{\lambda_3} = \frac{5}{7} + \frac{2 + \sqrt{2}}{14} + \frac{2 - \sqrt{2}}{14} = \frac{5}{7} + \frac{2}{7} = 1,$$

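as it should be. So far so good. The expectation value of $M_2$ can be computed in two different ways. First, I will use the standard probabilistic definition of the average:

$$\langle M_2\rangle = p_{\lambda_1}\lambda_1 + p_{\lambda_2}\lambda_2 + p_{\lambda_3}\lambda_3 = 2 \cdot \frac{5}{7} + (-1)\frac{2 + \sqrt{2}}{14} + 1 \cdot \frac{2 - \sqrt{2}}{14} = \frac{10 - \sqrt{2}}{7}.$$

And I will also compute this quantity using the quantum-mechanical definition:

$$\begin{aligned}
\langle M_2\rangle \equiv \langle\psi| \hat{M}_2 |\psi\rangle &= \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 \end{bmatrix}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix} \\
&= \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 2i - \frac{1}{\sqrt{2}} - 1 + i \\ -\frac{2i}{\sqrt{2}} - \frac{1 - i}{\sqrt{2}} \\ -2i - \frac{1}{\sqrt{2}} + 1 - i \end{bmatrix} = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 3i - \frac{1}{\sqrt{2}} - 1 \\ -\frac{1 + i}{\sqrt{2}} \\ -3i - \frac{1}{\sqrt{2}} + 1 \end{bmatrix} \\
&= \frac{1}{7}\left(6 + \frac{2i}{\sqrt{2}} + 2i - \frac{1 + i}{\sqrt{2}} - 2i - \frac{1}{\sqrt{2}} + 4 - \frac{i}{\sqrt{2}}\right) = \frac{1}{7}\left(10 - \sqrt{2}\right);
\end{aligned}$$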

again, exactly as promised. If immediately after measuring $M_2$ you attempt to measure $M_1$ and are interested in the probabilities of various outcomes (now you are talking about outcomes consisting of pairs of measurements, which are given by all nine possible pairs of eigenvalues $(\lambda_i^{(M_2)}, \lambda_j^{(M_1)})$), you have to take into account that after the first measurement, the system is no longer in the initial state $|\psi\rangle$. Depending on the outcome of the first measurement, it will be in a state represented by one of the eigenvectors of $M_2$. However, since these two matrices commute, and the eigenvectors of $M_1$ are also eigenvectors of $M_2$, the outcomes of the second measurement are completely determined by the outcome of the first, and there are only three possible results. For instance, if the first measurement produced the value $-1$ for $M_2$ (probability $(2 + \sqrt{2})/14$), the measurement of $M_1$ will be guaranteed to yield 2 (the state corresponding to eigenvalue $-1$ of matrix $M_2$ is described by the same vector as the eigenvector of $M_1$ belonging to its eigenvalue 2). Thus, the probability of getting the pair $(-1, 2)$ is still $(2 + \sqrt{2})/14$.
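The arithmetic of this example is easy to reproduce on a computer. The sketch below is a cross-check under stated assumptions: the matrix $M_2$ and the three eigenvectors are typed in exactly as they appear in the calculations above, the eigenvalue of each vector is recomputed rather than taken on trust, and all variable names are my own.

```python
import math

def inner(bra, ket):
    """<bra|ket>; the components of the bra are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def apply(matrix, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in matrix]

r2 = math.sqrt(2)
M2 = [[1, -1 / r2, -1],
      [-1 / r2, 0, -1 / r2],
      [-1, -1 / r2, 1]]
psi = [2j / math.sqrt(7), 1 / math.sqrt(7), (1 - 1j) / math.sqrt(7)]

# Normalized eigenvectors used in the example
vectors = [
    [-1 / r2, 0, 1 / r2],
    [0.5, r2 / 2, 0.5],
    [0.5, -r2 / 2, 0.5],
]

results = []
for v in vectors:
    image = apply(M2, v)
    eigenvalue = inner(v, image).real                  # <v|M2|v> for normalized v
    residual = max(abs(x - eigenvalue * y) for x, y in zip(image, v))
    probability = abs(inner(v, psi)) ** 2              # Born's rule, Eq. 3.77
    results.append((eigenvalue, probability, residual))

total = sum(p for _, p, _ in results)
mean_from_probabilities = sum(lam * p for lam, p, _ in results)
mean_from_matrix = inner(psi, apply(M2, psi)).real

for eigenvalue, probability, residual in results:
    print(f"eigenvalue {eigenvalue:+.6f}: probability {probability:.6f}")
print("sum of probabilities:", total)
print("average:", mean_from_probabilities, "vs", mean_from_matrix)
```

The `residual` stored for each vector measures how well it satisfies the eigenvalue equation, so a typo in the matrix or in an eigenvector shows up immediately as a nonzero number.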

If you measure $M_1$ first, the situation is a bit more complex since $M_1$ has degenerate eigenvalues. So, if you want, for instance, to find the probability of getting 1 after measuring $M_1$, you have to compute two probabilities, one for each degenerate state, and sum them up. To do that you can use the corresponding orthogonal and normalized vectors given in Eq. 3.38, which are common eigenvectors of both $M_1$ and $M_2$. This will yield

$$p_1 = \left|\frac{1}{\sqrt{2}}\frac{1}{\sqrt{7}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right|^2 + \left|\frac{1}{2}\frac{1}{\sqrt{7}}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} 1 \\ -\sqrt{2} \\ 1 \end{bmatrix}\right|^2 = \frac{10}{14} + \frac{4 - 2\sqrt{2}}{28} = \frac{12 - \sqrt{2}}{14}.$$

At this point a question might pop up in your head: is this result unique? Indeed, you already know that degenerate eigenvalues can be characterized by an infinite number of different sets of normalized and orthogonal eigenvectors. It would be nice if the probability did not depend on this arbitrary choice, but is it really so? I will give you a chance to answer this question as an exercise.

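Finally, let me compute the uncertainty of the observable $M_2$ in this experiment. For this computation I first need to find $M_2^2$, which is

$$M_2^2 = \begin{bmatrix} \frac{5}{2} & 0 & -\frac{3}{2} \\ 0 & 1 & 0 \\ -\frac{3}{2} & 0 & \frac{5}{2} \end{bmatrix}.$$

Now you can compute

$$\langle\psi| \hat{M}_2^2 |\psi\rangle = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} \frac{5}{2} & 0 & -\frac{3}{2} \\ 0 & 1 & 0 \\ -\frac{3}{2} & 0 & \frac{5}{2} \end{bmatrix}\begin{bmatrix} 2i \\ 1 \\ 1 - i \end{bmatrix} = \frac{1}{7}\begin{bmatrix} -2i & 1 & 1 + i \end{bmatrix}\begin{bmatrix} -\frac{3}{2} + \frac{13}{2}i \\ 1 \\ \frac{5}{2} - \frac{11}{2}i \end{bmatrix} = \frac{22}{7},$$

so that the uncertainty $\sigma_{M_2}^2$ is found to be

$$\sigma_{M_2}^2 = \langle\psi| \hat{M}_2^2 |\psi\rangle - \langle\psi| \hat{M}_2 |\psi\rangle^2 = \frac{22}{7} - \frac{1}{49}\left(10 - \sqrt{2}\right)^2 = \frac{52 + 20\sqrt{2}}{49}.$$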
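The last computation involves enough fractions and complex numbers that a numerical cross-check is worthwhile. The sketch below recomputes the square of the matrix, the average of $M_2^2$, and the variance from the matrix $M_2$ and the state given in the example; the variable and function names are mine, and the script makes no use of the intermediate results quoted in the text.

```python
import math

def inner(bra, ket):
    """<bra|ket>; the components of the bra are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def apply(matrix, ket):
    return [sum(row[j] * ket[j] for j in range(len(ket))) for row in matrix]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

r2 = math.sqrt(2)
M2 = [[1, -1 / r2, -1],
      [-1 / r2, 0, -1 / r2],
      [-1, -1 / r2, 1]]
psi = [2j / math.sqrt(7), 1 / math.sqrt(7), (1 - 1j) / math.sqrt(7)]

M2_squared = matmul(M2, M2)
mean = inner(psi, apply(M2, psi)).real                   # <psi|M2|psi>
mean_of_square = inner(psi, apply(M2_squared, psi)).real  # <psi|M2^2|psi>
variance = mean_of_square - mean ** 2                     # square of Eq. 3.79

for row in M2_squared:
    print([round(x, 12) for x in row])
print("average of M2:        ", mean)
print("average of M2 squared:", mean_of_square)
print("variance:             ", variance)
```

An independent check of the same number is to sum $\lambda_n^2 p_n$ over the three eigenvalues with the probabilities found earlier in the example.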

3.4 Problems

Section 3.1

Problem 10 A constant force F is acting on a particle of mass m. Derive an expression for the potential energy associated with this force, write down the Hamiltonian of the system, and derive Hamiltonian equations.

Problem 11 Consider a particle moving in a central potential field with Hamiltonian

$$H = \frac{p^2}{2m} + V(|\mathbf{r}|).$$

Compute the following Poisson brackets:

$$\{L_x, H\},\quad \{L_y, H\},\quad \{L_z, H\},$$

where $L_{x,y,z}$ are Cartesian components of the angular momentum of the particle in some arbitrarily chosen coordinate system. Interpret the results.

Section 3.2.1

Problem 12 Which of the following is a linear operator?

1. Inversion operator $\hat{P}$, which acts on functions of coordinates according to the rule $\hat{P}f(\mathbf{r}) = f(-\mathbf{r})$.

2. Square operator $\hat{S}$ defined as $\hat{S}f = f^2$.

3. Determinant operator $\widehat{\mathrm{Det}}$, which when applied to a square matrix turns it into the matrix's determinant.

4. Exchange operator $\hat{E}$ acting on functions of two variables as $\hat{E}f(x_1, x_2) = f(x_2, x_1)$.

5. Trace operator $\widehat{\mathrm{Tr}}$, which acts on a matrix and turns it into the sum of its diagonal elements.

Problem 13 Prove the linearity of the rotation operator.

Problem 14 Find a Hermitian conjugate for the integral operator $\hat{K}$ acting on integrable functions of a single variable and defined by kernel $K(x_1, x_2)$:

$$\hat{K}f = \int_{-\infty}^{\infty} K(x_1, x_2) f(x_2)\, dx_2.$$

The inner product is defined in a regular way: $\langle g | f\rangle = \int_{-\infty}^{\infty} g^*(x) f(x)\, dx$. Determine under which condition on the kernel this operator is Hermitian.

Problem 15 Expression $\hat{P} = |\alpha\rangle\langle\beta|$ can be understood as an operator acting in the following way:

$$\hat{P} |\chi\rangle \equiv |\alpha\rangle\langle\beta | \chi\rangle.$$

Find its Hermitian conjugate.

Section 3.2.2

Problem 16 Specify the condition that must be obeyed by an operator so that it is both unitary and Hermitian. Consider the following matrices:

$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix},\quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}.$$

Do they satisfy this condition?

Problem 17 For three operators $\hat{A}$, $\hat{B}$, and $\hat{C}$, prove the following identity (known as the Jacobi identity):

$$[[\hat{A}, \hat{B}], \hat{C}] + [[\hat{C}, \hat{A}], \hat{B}] + [[\hat{B}, \hat{C}], \hat{A}] = 0.$$

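Problem 18 Which of the following matrices are Hermitian?

1. $$\begin{bmatrix} 3i & 5i & 7 \\ -5i & 2 & 3 \\ 7 & 3 & 0 \end{bmatrix}$$

2. $$\begin{bmatrix} 1 & i & 2i \\ -i & 0 & 3 \\ -2i & 3 & 2 \end{bmatrix}$$

3. $$\begin{bmatrix} \sqrt{2} & 1 & -2 \\ -1 & 2 & 4\sqrt{5} \\ 7 & -4\sqrt{5} & \sqrt{3} \end{bmatrix}$$

4. $$\begin{bmatrix} 7 & 4 & 2 \\ 4 & 2 & 1 \\ 2 & 1 & -4 \end{bmatrix}$$

Problem 19 Prove the identity

$$\left(\hat{A}\hat{B}\right)^{-1} = \hat{B}^{-1}\hat{A}^{-1}.$$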

Problem 20 Prove the following properties of the commutators:

$$[\hat{T}_1, \hat{T}_2] = -[\hat{T}_2, \hat{T}_1],$$

$$[\hat{T}_1 + \hat{T}_2, \hat{T}_3] = [\hat{T}_1, \hat{T}_3] + [\hat{T}_2, \hat{T}_3],$$

$$[c_1\hat{T}_1, c_2\hat{T}_2] = c_1 c_2 [\hat{T}_1, \hat{T}_2].$$

Problem 21 If operator $\hat{D}$ is defined as

$$\hat{D}f(x) = \frac{df}{dx},$$

what would be an inverse of this operator?

Problem 22 Find an inverse of the following matrices:

1. $$\begin{bmatrix} 1 & i & 2i \\ -i & 0 & 3 \\ -2i & 3 & 2 \end{bmatrix}$$

2. $$\begin{bmatrix} 0 & i & 2 \\ -i & 0 & 1 \\ -i & i & 0 \end{bmatrix}$$

Problem 23 Consider an operator $\hat{\sigma}$ characterized by the following property: $\hat{\sigma}^2 = \hat{I}$, where $\hat{I}$ is a unity operator. Using power series expansion, find the closed-form expression (not in the form of a series) for the operator $\exp(i\hat{\sigma}t)$.

Problem 24 Prove that if the commutator of two Hermitian operators is a number, this number is necessarily imaginary.

Problem 25 Given that $[\hat{x}, \hat{p}] = i\hbar$, compute $[\hat{x}^2, \hat{p}^2]$.

Section 3.2.3

Problem 26 Consider matrices

$$\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$

1. Find the eigenvalues and normalized eigenvectors of these matrices.

2. Check orthogonality of the found vectors.

Problem 27 Consider two matrices:

$$A_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix};\quad A_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$

1. Show that these operators commute.

2. Find a set of eigenvectors common for both of them.

Problem 28 Find eigenvalues and normalized eigenvectors of the following matrix:

$$\begin{bmatrix} 1 & -\frac{1}{\sqrt{2}} & -1 \\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\ -1 & -\frac{1}{\sqrt{2}} & 1 \end{bmatrix}.$$

Problem 29 Consider the following matrix:

$$A = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}.$$

1. Find its eigenvalues. Are there degenerate ones?

2. Construct a system of normalized and orthogonal eigenvectors.

3. Show that

$$e^{xA} = \cosh x + A \sinh x.$$

Section 3.3.1

Problem 30 Consider an operator defined as

$$\hat{A} = |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2| + |\psi_3\rangle\langle\psi_3| - i|\psi_1\rangle\langle\psi_2| - |\psi_1\rangle\langle\psi_3| + i|\psi_2\rangle\langle\psi_1| - |\psi_3\rangle\langle\psi_1|,$$

where $|\psi_1\rangle$, $|\psi_2\rangle$, and $|\psi_3\rangle$ form an orthonormalized basis.

1. Check if this operator is Hermitian by computing $\hat{A}^\dagger$.

2. Compute $\hat{A}^2$.

3. What are the possible values an experimentalist can observe when measuring an observable represented by this operator?

4. Find the states in which the system will be immediately after the measurement for each of the possible outcomes. Verify that the states are represented by orthogonal vectors.

Problem 31 Show that if $\hat{P}$ is a projection operator, $\hat{I} - \hat{P}$ is also a projection operator.

Section 3.3.2

Problem 32 Derive the commutation relations

$$[\hat{L}_z, \hat{L}_x] = i\hbar\hat{L}_y,$$

$$[\hat{L}_y, \hat{L}_z] = i\hbar\hat{L}_x.$$

Problem 33 Prove that the operator of the square of the angular momentum, $\hat{L}^2$, commutes with all components of the angular momentum operator, $\hat{L}_{x,y,z}$.

Problem 34 Compute the commutators

$$[\hat{L}_z, \hat{x}],\quad [\hat{L}_z, \hat{y}],\quad [\hat{L}_z, \hat{z}],$$

$$[\hat{L}_z, \hat{p}_x],\quad [\hat{L}_z, \hat{p}_y],\quad [\hat{L}_z, \hat{p}_z],$$

$$[\hat{L}^2, \hat{x}],\quad [\hat{L}^2, \hat{y}],\quad [\hat{L}^2, \hat{z}],$$

$$[\hat{L}^2, \hat{p}_x],\quad [\hat{L}^2, \hat{p}_y],\quad [\hat{L}^2, \hat{p}_z].$$

Problem 35 Prove that

$$[\hat{L}_{x,y,z}, \hat{p}^2] = [\hat{L}^2, \hat{p}^2] = 0.$$

Section 3.3.4

Problem 36 Prove that

$$\hat{L}_- |l, m\rangle = \hbar\sqrt{l(l+1) - m(m-1)}\, |l, m-1\rangle.$$

Problem 37 Compute the following expressions:

$$\langle l, m'| \hat{L}_- |l, m\rangle, \qquad \langle l, m'| \hat{L}_+ |l, m\rangle.$$

For $l = 1$ present the results as a matrix.

Problem 38 Compute

$$\langle l, m'| \hat{L}_x^2 |l, m\rangle.$$

Hint: Use the representation of $\hat{L}_x$ in terms of raising and lowering ladder operators.

Section 3.3.5

Problem 39 An observable $A$ represented by an operator $\hat{A}$ can be in two mutually exclusive states represented by eigenvectors of $\hat{A}$, $|a_1\rangle$ and $|a_2\rangle$, where $a_{1,2}$ are the corresponding eigenvalues. The second observable $B$ represented by an operator $\hat{B}$ also can be in two mutually exclusive states represented by eigenvectors of $\hat{B}$, $|b_1\rangle$ and $|b_2\rangle$, where $b_{1,2}$ are the corresponding eigenvalues. These eigenvectors can be related to each other as

$$|a_1\rangle = \frac{1}{5}\left(3|b_1\rangle + 4|b_2\rangle\right)$$

$$|a_2\rangle = \frac{1}{5}\left(4|b_1\rangle - 3i|b_2\rangle\right).$$

1. If observable A is measured and value a1 is obtained, what is the state of the system immediately after the measurement?

2. If now B is measured, what are the possible outcomes, and what are their probabilities?

3. Right after B was measured, A is measured again. What is the probability of getting a1 for different possible outcomes of the first measurement?

Problem 40 A quantum system is in a state described by a vector

$$|\alpha_1\rangle = \frac{i}{\sqrt{3}}|\chi_1\rangle + \frac{\sqrt{2}}{\sqrt{3}}|\chi_2\rangle.$$

Find the probability that a measurement of some observable will bring the system to the state described by a vector

$$|\alpha_2\rangle = \frac{1+i}{\sqrt{3}}|\chi_1\rangle + \frac{1}{\sqrt{6}}|\chi_2\rangle + \frac{1}{\sqrt{6}}|\chi_3\rangle$$

where $|\chi_{1,2,3}\rangle$ form an orthonormal basis.

Problem 41 Consider a quantum system in a state described by a column vector

$$|\psi\rangle = \frac{1}{\sqrt{5}}\begin{bmatrix} -i \\ 2 \\ 0 \end{bmatrix}.$$

The system is characterized by two observables $T_1$ and $T_2$ presented by matrices

$$T_1 = \begin{bmatrix} 1 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \quad T_2 = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & i \\ 0 & -i & 0 \end{bmatrix}.$$

1. If $T_1$ is measured first and $T_2$ immediately afterward, what is the probability of obtaining $-1$ for $T_1$ and $3$ for $T_2$?

2. What are the probabilities of getting the same values if the order of measurements is reversed? Discuss the result in terms of commutation properties of the two matrices.

Problem 42 Consider a system described by the Hamiltonian

$$H = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 & -i & 0 \\ i & 3 & 3 \\ 0 & 3 & 0 \end{bmatrix}$$

placed in a quantum state described by a column vector

$$|\psi\rangle = \begin{bmatrix} 4 - i \\ -2 + 5i \\ 3 + 2i \end{bmatrix}.$$

1. Find the expectation value of energy in this state.
2. Find the uncertainty of energy in this state.
3. Find the possible values of energy measurements and their probabilities.
4. Use the results of the previous task to calculate the expectation value and uncertainty of energy again. Compare the results with the results of tasks 1 and 2.

Problem 43 Go back to Example 14 at the end of the chapter, and using a different set of orthogonal and normalized eigenvectors of M1 (you will have to find it first, of course), compute the probability of getting the degenerate eigenvalue of M1. Is the result the same?

Problem 44 Consider a system described by a Hamiltonian

$$\hat{H} = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2$$

presented by an operator acting on square-integrable functions of a single variable $x$ forming a Hilbert space with an inner product defined in Sect. 2.1. This system is prepared in state

$$|\psi\rangle = \frac{1}{\sqrt{3}}|\psi_1\rangle + \frac{\sqrt{2}}{\sqrt{3}}|\psi_2\rangle$$

where vectors $|\psi_{1,2}\rangle$ are defined as the following functions:

$$|\psi_1\rangle = \exp\left(-\frac{x^2}{2}\right); \quad |\psi_2\rangle = \left(1 - 2x^2\right)\exp\left(-\frac{x^2}{2}\right).$$

1. Verify that these functions are eigenvectors of the Hamiltonian, determine the respective eigenvalues, and normalize the eigenvectors.

2. Rewrite the expression for the state $|\psi\rangle$ in terms of normalized versions of the vectors $|\psi_{1,2}\rangle$.

3. If the energy of the system is measured, what are the possible outcomes, and what are their probabilities?

4. Find expectation values and uncertainties of the operators

$$\hat{\pi} f(x) = -i\frac{df}{dx}; \quad \hat{x} f(x) = x f(x)$$

in state $|\psi\rangle$.

Chapter 4 Unitary Operators and Quantum Dynamics

In the previous section, I explained how one can dig out experimentally relevant information using states of a quantum system and operators representing the observables. The remaining burning question, however, is how we can find these states so that we can use these methods. In a typical experiment, an experimentalist begins by “preparing” a quantum system in some state, which they believe they know.¹ After that they smash the system with a hammer, or hit it with laser light, or subject it to an electric or magnetic field, wait for some time, and measure new values of the selected observables. In order to predict the results of new measurements, you must be able to describe how the quantum system changes between the time of preparation and the time of subsequent measurement, or, speaking more scientifically, you must know its dynamics.

As was made clear in the previous section, you need two objects to predict the results of a measurement: a state of the system and the operator assigned to the measured observable. Now you can ask an interesting question: “When the quantum system evolves in time, what is actually changing: the state or the operator?” To make this question more specific, consider an expectation value of an observable described by operator $\hat{T}$: $\langle\alpha|\hat{T}|\alpha\rangle$. When your system evolves, this expectation value becomes a function of time. The question is, which element of the expression for the expectation value, $\hat{T}$ or $|\alpha\rangle$, must be considered as a time-dependent quantity to describe the dynamics of the expectation value? It turns out that time dependence can be ascribed to either of these two elements, and depending on the choice, it will generate two different but equivalent pictures of quantum mechanics. In the so-called Schrödinger picture, the state vectors are treated as time-dependent quantities, while operators remain fixed rules transforming the states. In the Heisenberg picture, the state vector is considered as a constant, and all the dynamics of the system is ascribed to the time-dependent operators. The origins of these two pictures can be found in the earlier days of quantum theory with the Heisenberg matrix mechanics competing against Schrödinger’s matter wave theory. The first attempt to prove equivalence of the two pictures was undertaken by Schrödinger as early as 1926, but the rigorous mathematical proof of the equivalence did not exist until John von Neumann published his definitive book Mathematical Foundations of Quantum Mechanics in 1932.

¹ Preparation of a quantum system in a predefined state usually consists in carrying out a measurement, but it is not an easy task to prepare a system in a state we want.

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_4

Von Neumann was one of the major figures in mathematics and mathematical physics of the twentieth century. Born to a rich Jewish family in Hungary, which was elevated to nobility by Austro-Hungarian Emperor Franz Joseph (hence the prefix von in his name), he was a child prodigy, got his Ph.D. in mathematics at the age of 23, and became the youngest privatdocent at the University of Berlin. In 1929 he got an offer from Princeton University and moved to the USA. He brought his entire family to America in 1938, saving them from certain death. In addition to laying a rigorous mathematical foundation for quantum theory, von Neumann is famous for his role in the Manhattan Project and for developing the concept of digital computers (among other things).

After this brief historical detour, I begin presentation of quantum dynamics starting with the Schrödinger picture.

4.1 Schrödinger Picture

4.1.1 Time-Evolution Operator and Schrödinger Equation

The statistical interpretation of the quantum mechanical formalism makes sense only if all vectors describing states of a quantum system remain normalized at all times. I will begin digging deeper into this issue by computing the norm of a generic vector $\|\alpha\|$ using Eq. 3.39. First, I need the corresponding bra vector:

$$\langle\alpha| = \sum_n a_n^* \langle\chi_n|$$

so that I can write for the norm

$$\|\alpha\|^2 = \langle\alpha|\alpha\rangle = \sum_m \sum_n a_m a_n^* \langle\chi_n|\chi_m\rangle = \sum_m \sum_n a_m a_n^* \delta_{nm} = \sum_n |a_n|^2. \tag{4.1}$$

According to postulate 4 in Sect. 3.3.5, $|a_n|^2$ is equal to the probability $p_n$ that the respective eigenvalue will be observed. Equation 4.1 in this case can be interpreted as a statement that the squared norm of a generic vector is equal to the sum of probabilities of all possible measurement outcomes. The latter must obviously be equal to unity, $\sum_n p_n = 1$, regardless of the time dependence of state $|\alpha\rangle$. This result has quite a profound consequence. Indeed, time dependence of a state vector can be considered as a transformation of a vector $|\alpha(t_0)\rangle$ defined at some initial instant of time $t_0$ into another vector $|\alpha(t)\rangle$ at time $t$ under the action of an operator:

$$|\alpha(t)\rangle = \hat{U}(t, t_0)|\alpha(t_0)\rangle. \tag{4.2}$$

In order to keep the norm of the vector unchanged, the operator $\hat{U}(t, t_0)$ must be unitary, which significantly limits the class of operators that can be used to describe the dynamics of quantum states. It also must obey an obvious condition:

$$\hat{U}(t_0, t_0) = \hat{I}. \tag{4.3}$$

Now, consider an evolution of the system from state $|\alpha(t_0)\rangle$ to state $|\alpha(t_1)\rangle$ and then to state $|\alpha(t_f)\rangle$, which can be described as

$$\begin{aligned}
|\alpha(t_1)\rangle &= \hat{U}(t_1, t_0)|\alpha(t_0)\rangle \\
|\alpha(t_f)\rangle &= \hat{U}(t_f, t_1)|\alpha(t_1)\rangle.
\end{aligned}$$

I can also describe a system’s dynamics from the initial state to the final, bypassing the intermediate state:

$$|\alpha(t_f)\rangle = \hat{U}(t_f, t_0)|\alpha(t_0)\rangle.$$

Comparing this with the first two lines of the previous equation, you can infer an important property of the time-evolution operator $\hat{U}$:

$$\hat{U}(t_f, t_0) = \hat{U}(t_f, t_1)\hat{U}(t_1, t_0). \tag{4.4}$$

An important corollary of Eq. 4.4 is obtained by setting $t_f = t_0$, which yields

$$\hat{U}(t_0, t_1)\hat{U}(t_1, t_0) = \hat{I} \;\Rightarrow\; \hat{U}(t_0, t_1) = \hat{U}^{-1}(t_1, t_0) \tag{4.5}$$

where I also used Eq. 4.3. In other words, the reversal of time in quantum dynamics is equivalent to replacing the time-evolution operator with its inverse. This idea can also be expressed by saying that by inverting the time-evolution operator, you describe the evolution of the system from present to the past. This property can also be described as reversibility of quantum dynamics: taking a system from $t_0$ to $t_f$ and back brings the system back to its original state, completely reversing its initial evolution.
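The composition rule, Eq. 4.4, and its corollary, Eq. 4.5, can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the text: a two-level system with Hamiltonian $\hat{H} = \hat{\sigma}_x$ (a Pauli matrix) in units where $\hbar = 1$, for which the time-evolution operator has a simple closed form. The helper names (`matmul`, `dagger`, `evolution`, `max_diff`) are purely illustrative.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate (conjugate transpose) of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def evolution(t, t0):
    """U(t, t0) = exp(-i H (t - t0)) for the assumed H = sigma_x, hbar = 1.

    Since sigma_x squared is the identity, the exponential reduces to
    cos(t - t0) * I - i * sin(t - t0) * sigma_x.
    """
    c, s = math.cos(t - t0), math.sin(t - t0)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]

def max_diff(a, b):
    """Largest absolute difference between matching matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
t0, t1, tf = 0.3, 1.1, 2.5  # arbitrary illustrative times

# Eq. 4.4: U(tf, t0) = U(tf, t1) U(t1, t0)
composition_error = max_diff(evolution(tf, t0),
                             matmul(evolution(tf, t1), evolution(t1, t0)))
# Eq. 4.5: U(t0, t1) U(t1, t0) = I
inverse_error = max_diff(matmul(evolution(t0, t1), evolution(t1, t0)), IDENTITY)
# Unitarity: U^dagger U = I
unitarity_error = max_diff(matmul(dagger(evolution(t1, t0)), evolution(t1, t0)),
                           IDENTITY)

print(composition_error, inverse_error, unitarity_error)
```

All three printed numbers should be at the level of floating-point rounding, which is what Eqs. 4.4 and 4.5 together with unitarity require for this particular choice of Hamiltonian.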

Now, let me consider the action of $\hat{U}(t_1, t_0)$ over an infinitesimally small time interval from $t_0$ to $t_1 = t_0 + dt$. Expanding this operator over the small interval $dt$ and using Eq. 4.3, I can write:

$$\hat{U}(t_0 + dt, t_0) = \hat{I} + \hat{G}\,dt \tag{4.6}$$

where $\hat{G} \equiv \left. d\hat{U}/dt \right|_{t = t_0}$ is an operator obtained by differentiating the time-evolution operator with respect to time. The inverse of the operator defined by Eq. 4.6 can be found by expanding the function $(1 + x)^{-1}$ with respect to $x$ and keeping only terms linear in $x$: $(1 + x)^{-1} \simeq 1 - x$. Applying this to operator $\left(\hat{I} + \hat{G}\,dt\right)^{-1}$, I get

$$\hat{U}^{-1}(t_0 + dt, t_0) = \hat{I} - \hat{G}\,dt.$$

At the same time, Hermitian conjugation of Eq. 4.6 returns

$$\hat{U}^\dagger(t_0 + dt, t_0) = \hat{I} + \hat{G}^\dagger dt.$$

Since the time-evolution operator is unitary ($\hat{U}^{-1} = \hat{U}^\dagger$), operator $\hat{G}$ has to be anti-Hermitian: $\hat{G}^\dagger = -\hat{G}$, so that it can be presented as $\hat{G} = -i\hat{H}/\hbar$ (see Eq. 3.30), where $\hat{H}$ is a Hermitian operator and $\hbar$ is introduced to ensure that $\hat{H}$ has the dimension of energy. Indeed, since the time-evolution operator is dimensionless, it is clear that operator $\hat{G}$ has the dimension of inverse time. The dimension of Planck’s constant is that of energy multiplied by time, so it is clear that $\hat{H}$ has indeed the dimension of energy. This simple analysis leads the way to the next postulate of quantum theory.

Postulate 6 The Hermitian operator $\hat{H}$ in the expansion of the time-evolution operator is the operator version of the Hamiltonian function of classical mechanics.

Thus, Eq. 4.6 can now be rewritten as

$$\hat{U}(t_0 + dt, t_0) = \hat{I} - i\frac{\hat{H}}{\hbar}dt. \tag{4.7}$$

Taking advantage of the composition rule, Eq. 4.4, I can write:

$$\hat{U}(t + dt, t_0) = \hat{U}(t + dt, t)\hat{U}(t, t_0) = \left(\hat{I} - i\frac{\hat{H}}{\hbar}dt\right)\hat{U}(t, t_0) = \hat{U}(t, t_0) - i\frac{\hat{H}}{\hbar}\hat{U}(t, t_0)\,dt$$

where I also used Eq. 4.7. The main difference between this last expression and Eq. 4.7 is that $t$ in the former can be separated from $t_0$ by a finite interval. The last equation can be rewritten in the form of a differential equation:

$$\frac{d\hat{U}(t, t_0)}{dt} = -i\frac{\hat{H}}{\hbar}\hat{U}(t, t_0). \tag{4.8}$$

Applying Eq. 4.7 to Eq. 4.2, I can also derive:

$$|\alpha(t + dt)\rangle = \left(\hat{I} - \frac{i}{\hbar}\hat{H}\,dt\right)|\alpha(t)\rangle \;\Rightarrow\; \frac{|\alpha(t + dt)\rangle - |\alpha(t)\rangle}{dt} = -\frac{i}{\hbar}\hat{H}|\alpha(t)\rangle$$

which can be rewritten in a standard form

$$i\hbar\frac{d|\alpha\rangle}{dt} = \hat{H}|\alpha\rangle \tag{4.9}$$

called the Schrödinger equation. As any differential equation, Eq. 4.9 has to be complemented by an initial condition specifying the state of the system at an arbitrarily chosen initial time. Given the role of the Hamiltonian in classical mechanics discussed in Sect. 3.1, it is not very surprising that the same quantity (in its operator reincarnation) determines the dynamics of quantum systems as well.
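The role of the infinitesimal step in Eq. 4.7 can be illustrated by applying it many times in a row and comparing the result with the exact solution of Eq. 4.9. This is a rough sketch rather than a recommended integration scheme, and it rests on assumptions not made in the text: a two-level system with $\hat{H} = \hat{\sigma}_x$, units with $\hbar = 1$, and hypothetical function names.

```python
import math

def apply_step(state, dt):
    """Apply (I - i H dt) from Eq. 4.7 with the assumed H = sigma_x, hbar = 1."""
    a, b = state
    # sigma_x swaps the two components of the state vector.
    return (a - 1j * dt * b, b - 1j * dt * a)

def evolve_by_small_steps(state, total_time, steps):
    """Compose many infinitesimal steps using the rule of Eq. 4.4."""
    dt = total_time / steps
    for _ in range(steps):
        state = apply_step(state, dt)
    return state

def evolve_exactly(state, total_time):
    """Exact evolution exp(-i sigma_x t) = cos(t) I - i sin(t) sigma_x."""
    a, b = state
    c, s = math.cos(total_time), math.sin(total_time)
    return (c * a - 1j * s * b, c * b - 1j * s * a)

initial = (1 + 0j, 0j)
stepped = evolve_by_small_steps(initial, 1.0, 100_000)
exact = evolve_exactly(initial, 1.0)

error = max(abs(x - y) for x, y in zip(stepped, exact))
norm_squared = sum(abs(x) ** 2 for x in stepped)
print(error, norm_squared)
```

The stepped state approaches the exact one as the step shrinks, but its norm is slightly larger than one: the operator $\hat{I} - i\hat{H}dt/\hbar$ is unitary only to first order in $dt$, which is exactly the accuracy to which it was derived.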

4.1.2 Stationary States

If the Hamiltonian does not contain an explicit time dependence (which might appear, for instance, if an atom interacts with a time-dependent electric field $E(t)$), Eq. 4.9 has a very simple formal solution:

$$|\alpha(t)\rangle = \exp\left(-i\frac{\hat{H}}{\hbar}t\right)|\alpha_0\rangle, \tag{4.10}$$

where $|\alpha_0\rangle$ is the state of the system at time $t = 0$. For practical calculations, however, this solution is not very helpful because the action of the exponent of an operator on an arbitrary vector in general is not easy to compute. The situation becomes much simpler if the initial state is presented by one of the eigenvectors of the Hamiltonian. If $|\alpha_0\rangle = |\chi_n\rangle$, where $|\chi_n\rangle$ is an eigenvector of $\hat{H}$ with respective eigenvalue $E_n$

$$\hat{H}|\chi_n\rangle = E_n|\chi_n\rangle, \tag{4.11}$$

the right-hand side of Eq. 4.10 can be computed as follows:

$$|\alpha(t)\rangle = \exp\left(-i\frac{\hat{H}}{\hbar}t\right)|\chi_n\rangle = \sum_{m=0}^{\infty}\frac{1}{m!}\left(\frac{-it}{\hbar}\right)^m \hat{H}^m|\chi_n\rangle = \sum_{m=0}^{\infty}\frac{1}{m!}\left(\frac{-it}{\hbar}\right)^m E_n^m|\chi_n\rangle = \exp\left(-i\frac{E_n}{\hbar}t\right)|\chi_n\rangle, \tag{4.12}$$

where I used the definition of the exponential function of an operator, Eq. 3.21, and the fact that

$$\hat{H}^m|\chi_n\rangle = E_n^m|\chi_n\rangle,$$

which is easily proved. (You will have a chance to prove it when doing your homework.) Thus, if a system is initially in a state represented by an eigenvector of the Hamiltonian, it remains in this state forever and ever. The time-dependent factor in this case is a complex number with absolute value equal to unity (pure phase as physicists like to say) and does not affect, therefore, any measurable quantities. Indeed, consider, for instance, an expectation value of some generic operator $\hat{T}$ when a system is in the state described by Eq. 4.12:

$$\langle\alpha(t)|\hat{T}|\alpha(t)\rangle = \exp\left(i\frac{E_n}{\hbar}t\right)\langle\chi_n|\hat{T}|\chi_n\rangle\exp\left(-i\frac{E_n}{\hbar}t\right) = \langle\chi_n|\hat{T}|\chi_n\rangle.$$

The eigenvector equation 4.11 is often called the time-independent Schrödinger equation, and its solutions represent the very same stationary states, which were postulated by Bohr, whose existence was proven in Davisson–Germer experiments mentioned in the Introduction, and which became the main object of the Schrödinger wave mechanics. The corresponding eigenvalues are called energy levels or simply energies. Here I will use the term stationary states to designate solutions of the time-independent Schrödinger equation with exponential time dependence attached to the eigenvectors and given by Eq. 4.12. As you just saw, this time dependence does not affect experimentally observable quantities, which remain independent of time, justifying the name “stationary” for these states.

I need to point out a general ambiguity in the relation between quantum states and the vectors representing them: the latter are always defined with accuracy to a phase, meaning that all vectors can be multiplied by a complex number of unit magnitude without affecting any physical results. This is obvious from Eq. 4.10, which does not change upon multiplying the state vector by any constant factor. But since it is required that the states are normalized, this constant factor is limited to have a magnitude equal to unity, i.e., to be a pure phase. However, in the case presented in Eq. 4.12, the multiplying factor is not constant and, therefore, cannot be simply dismissed, making it physically significant. This significance manifests itself, however, only when we have to deal with several stationary states. Indeed, since energy is always defined up to an additive constant, one can always make the energy eigenvalue corresponding to any one of the stationary states vanish, thereby killing the time dependence of the corresponding stationary state. This vanishing trick, however, can be achieved only for one state, while all others will retain their exponential factor, albeit with different energy values equal to the difference between their initial values and the one you chose to be equal to zero.²

In order to demonstrate that this general property of energies retains its meaning in quantum theory as well, I will consider a state evolving from a superposition of two eigenvectors of a Hamiltonian with different eigenvalues. Thus, assume that the initial state of the system is

$$|\alpha_0\rangle = a_1|\chi_1\rangle + a_2|\chi_2\rangle.$$

² Technically this can be achieved by subtracting one of the energy eigenvalues from the potential appearing in the Hamiltonian, which is equivalent to a simple change of the zero level of the energies.


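Using linearity of the time-evolution operator and results from Eq. 4.12, I can easily compute:

$$\begin{aligned}
|\alpha(t)\rangle &= a_1\exp\left(-i\frac{\hat{H}}{\hbar}t\right)|\chi_1\rangle + a_2\exp\left(-i\frac{\hat{H}}{\hbar}t\right)|\chi_2\rangle \\
&= a_1\exp\left(-i\frac{E_1}{\hbar}t\right)|\chi_1\rangle + a_2\exp\left(-i\frac{E_2}{\hbar}t\right)|\chi_2\rangle \\
&= \exp\left(-i\frac{E_1}{\hbar}t\right)\left[a_1|\chi_1\rangle + a_2\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)|\chi_2\rangle\right].
\end{aligned} \tag{4.13}$$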

Equation 4.13 shows that an initial vector in the form of a superposition of two eigenvectors of a Hamiltonian evolves by “dressing up” each of the initial eigenvectors with the exponential time factor containing the energy eigenvalue corresponding to the respective eigenvector. However, the absolute values of these energy eigenvalues are again not important, as the dynamics of the state is determined by the difference between them. I emphasized this point in the last line of Eq. 4.13 by factoring out one of the time-dependent exponential factors. It is clear that the overall phase factor will again disappear from all experimentally relevant expressions, and the entire time dependence will be determined by $\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)$. Evidently, it would not matter for this dynamics if I factored out the other exponential factor. To illustrate this point, I will now compute an expectation value of some generic operator with the state described by Eq. 4.13:

$$\begin{aligned}
\langle\alpha(t)|\hat{T}|\alpha(t)\rangle &= \exp\left(i\frac{E_1}{\hbar}t\right)\left[a_1^*\langle\chi_1| + a_2^*\langle\chi_2|\exp\left(i\frac{E_2 - E_1}{\hbar}t\right)\right]\hat{T}\left[a_1|\chi_1\rangle + a_2\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right)|\chi_2\rangle\right]\exp\left(-i\frac{E_1}{\hbar}t\right) \\
&= |a_1|^2 T_{11} + |a_2|^2 T_{22} + T_{12}a_1^* a_2\exp\left(-i\frac{E_2 - E_1}{\hbar}t\right) + T_{21}a_2^* a_1\exp\left(i\frac{E_2 - E_1}{\hbar}t\right)
\end{aligned}$$

where $T_{ij} = \langle\chi_i|\hat{T}|\chi_j\rangle$. Taking into account that for Hermitian operators diagonal elements are real-valued and nondiagonal elements are complex conjugates of their transposed elements ($T_{ij} = T_{ji}^*$), this can be written down as

$$\langle\alpha(t)|\hat{T}|\alpha(t)\rangle = |a_1|^2 T_{11} + |a_2|^2 T_{22} + 2|T_{12}||a_1||a_2|\cos\left(\frac{E_2 - E_1}{\hbar}t + \delta_{T_{21}} + \delta_{a_1} - \delta_{a_2}\right), \tag{4.14}$$

where the $\delta$s are phases of the quantities appearing in the corresponding subscripts. This expression is explicitly real and is periodic with a frequency $(E_2 - E_1)/\hbar$ determined by the difference of energies. I will leave it to you as an exercise to demonstrate that this result wouldn’t change if you factor out $\exp\left(-i\frac{E_2}{\hbar}t\right)$ instead of $\exp\left(-i\frac{E_1}{\hbar}t\right)$.

In a general case, expanding an arbitrary initial state vector in the basis of the eigenvectors of the Hamiltonian, you can see that the time dependence of the vector representing the state of the system is obtained by adding the corresponding exponential factors $\exp\left(-i\frac{E_n}{\hbar}t\right)$ in front of each $|\chi_n\rangle$ term in this expansion:

$$|\alpha(t)\rangle = \sum_n a_n\exp\left(-i\frac{E_n}{\hbar}t\right)|\chi_n\rangle. \tag{4.15}$$

Expansion coefficients $a_n$ are determined by the initial state $|\alpha_0\rangle$ with the help of Eq. 2.24. Equation 4.15 essentially solves the problem of quantum dynamics provided one knows the eigenvalues and eigenvectors of the system’s Hamiltonian. For this reason solving the time-independent Schrödinger equation is one of the main technical problems in quantum theory. Accordingly, much of this text, as well as of all other books on quantum mechanics, will be devoted to devising various ways of doing so.
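The two-term case of Eq. 4.15 offers a simple way to check Eq. 4.14 numerically. The sketch below compares the expectation value computed directly from the evolved coefficients with the cosine form of Eq. 4.14. All numerical values (coefficients, energies, matrix elements) are arbitrary assumptions chosen for illustration, $\hbar$ is set to 1, and the function names are hypothetical.

```python
import cmath
import math

def expectation_direct(a1, a2, e1, e2, t11, t22, t12, t):
    """<alpha(t)|T|alpha(t)> from the evolved coefficients of Eq. 4.15 (hbar = 1)."""
    c1 = a1 * cmath.exp(-1j * e1 * t)
    c2 = a2 * cmath.exp(-1j * e2 * t)
    t21 = t12.conjugate()  # Hermiticity of T
    return (abs(c1) ** 2 * t11 + abs(c2) ** 2 * t22
            + c1.conjugate() * c2 * t12 + c2.conjugate() * c1 * t21)

def expectation_cosine(a1, a2, e1, e2, t11, t22, t12, t):
    """Right-hand side of Eq. 4.14 (hbar = 1)."""
    t21 = t12.conjugate()
    phase = ((e2 - e1) * t + cmath.phase(t21)
             + cmath.phase(a1) - cmath.phase(a2))
    return (abs(a1) ** 2 * t11 + abs(a2) ** 2 * t22
            + 2 * abs(t12) * abs(a1) * abs(a2) * math.cos(phase))

# Arbitrary illustrative parameters (not taken from the text).
params = dict(a1=complex(0.6, 0.0), a2=complex(0.0, 0.8),
              e1=1.0, e2=2.5, t11=0.5, t22=-1.0, t12=complex(0.3, 0.4))

times = [0.0, 0.37, 1.2, 4.9]
direct_values = [expectation_direct(t=t, **params) for t in times]
cosine_values = [expectation_cosine(t=t, **params) for t in times]

# The expectation value repeats after one period 2*pi/(E2 - E1).
period = 2 * math.pi / (params["e2"] - params["e1"])
shifted_values = [expectation_direct(t=t + period, **params) for t in times]

for d, c in zip(direct_values, cosine_values):
    print(d.real, c)
```

The imaginary parts of the directly computed values vanish to rounding accuracy, and the real parts coincide with the cosine form, including the phase combination $\delta_{T_{21}} + \delta_{a_1} - \delta_{a_2}$.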

If the Hamiltonian has a continuous spectrum of energy, the same idea for generating the time-dependent state from an initial state still works. One only needs to replace the sum over the discrete index in Eq. 4.15 by an integral over a relevant continuous quantity $k$ labeling states of the system to get

$$|\alpha(t)\rangle = \int dk\, a(k)\exp\left(-i\frac{E_k}{\hbar}t\right)|\chi_k\rangle. \tag{4.16}$$

Coefficients $a(k)$ are again determined by an initial state in exactly the same way as in the discrete case (you will be well advised to remember though that the operational definitions of the inner product can be very different in the discrete and continuous cases).

4.1.3 Ehrenfest Theorem and Correspondence Principle

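I want to finish the discussion of the Schrödinger picture by deriving the so-called Ehrenfest theorem, which is concerned with the dynamics of the expectation value of a generic Hermitian operator $\hat{A}(t)$, which might have its own explicit time dependence. Assuming that the system is in state $|\alpha(t)\rangle$, I will derive a differential equation for the quantity $\langle\hat{A}(t)\rangle = \langle\alpha(t)|\hat{A}|\alpha(t)\rangle$, where $\langle\hat{A}(t)\rangle$ is a frequently used shortened notation for the expectation values. This expression can be differentiated using standard rules for differentiation of a product of several functions:

$$\begin{aligned}
\frac{d\langle\hat{A}(t)\rangle}{dt} &= \frac{d\langle\alpha(t)|}{dt}\hat{A}|\alpha(t)\rangle + \langle\alpha(t)|\frac{\partial\hat{A}}{\partial t}|\alpha(t)\rangle + \langle\alpha(t)|\hat{A}\frac{d|\alpha(t)\rangle}{dt} \\
&= \frac{i}{\hbar}\langle\alpha(t)|\hat{H}\hat{A}|\alpha(t)\rangle - \frac{i}{\hbar}\langle\alpha(t)|\hat{A}\hat{H}|\alpha(t)\rangle + \langle\alpha(t)|\frac{\partial\hat{A}}{\partial t}|\alpha(t)\rangle \\
&= -\frac{i}{\hbar}\left\langle\left[\hat{A}, \hat{H}\right]\right\rangle + \left\langle\frac{\partial\hat{A}}{\partial t}\right\rangle.
\end{aligned} \tag{4.17}$$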

In the Schrödinger picture, the operators are devoid of their own dynamics. The time derivative in the last term of the Ehrenfest theorem takes into account a possibility of an external time dependence of an operator, which is not related to its internal dynamics. This explicit time dependence is a reflection of the changing environment of the system, such as a time-dependent electromagnetic field interacting with an atom. If operator $\hat{A}$ does not have such an externally imposed time dependence, then the last term in Eq. 4.17 vanishes, and the dynamics of the expectation value of the observable represented by $\hat{A}$ is completely determined by its commutator with the Hamiltonian.

There is a special class of observables, whose operators commute with the Hamiltonian. You already know that such observables are compatible with the Hamiltonian, i.e., they would have a definite value if the system is in one of its stationary states. The Ehrenfest theorem shows that such observables have an additional property: regardless of the state of the system, their expectation values do not depend on time. In other words, the expectation values of observables whose operators commute with the Hamiltonian are conserved quantities.
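Both statements, Eq. 4.17 without the explicit-time term and the conservation of observables commuting with the Hamiltonian, can be tested on a toy model. The sketch below assumes a two-level system written in the eigenbasis of its Hamiltonian, $\hbar = 1$, and arbitrary illustrative numbers. The time derivative on the left-hand side is estimated with a central finite difference, so agreement is expected only up to a small numerical error.

```python
import cmath

E = (1.0, 2.5)              # assumed energy eigenvalues, hbar = 1
COEFFS = (0.6 + 0j, 0.8j)   # assumed expansion coefficients of the initial state

def state(t):
    """Coefficients a_n exp(-i E_n t) of the evolved state, as in Eq. 4.15."""
    return [a * cmath.exp(-1j * e * t) for a, e in zip(COEFFS, E)]

def expectation(op, t):
    """<alpha(t)| op |alpha(t)> for a 2x2 matrix op in the energy eigenbasis."""
    c = state(t)
    return sum(c[i].conjugate() * op[i][j] * c[j]
               for i in range(2) for j in range(2))

def commutator_with_h(op):
    """[A, H] for diagonal H: element (i, j) equals A_ij * (E_j - E_i)."""
    return [[op[i][j] * (E[j] - E[i]) for j in range(2)] for i in range(2)]

def ehrenfest_sides(op, t, h=1e-5):
    """Return (d<A>/dt by central difference, -i <[A, H]>)."""
    lhs = (expectation(op, t + h) - expectation(op, t - h)) / (2 * h)
    rhs = -1j * expectation(commutator_with_h(op), t)
    return lhs, rhs

# A generic Hermitian observable that does not commute with H.
A = [[0.5 + 0j, 0.3 + 0.4j], [0.3 - 0.4j, -1.0 + 0j]]
lhs, rhs = ehrenfest_sides(A, 0.7)

# A diagonal observable commutes with the diagonal H and must be conserved.
CONSERVED = [[2.0 + 0j, 0j], [0j, -3.0 + 0j]]
lhs_c, rhs_c = ehrenfest_sides(CONSERVED, 0.7)

print(lhs, rhs)
print(lhs_c, rhs_c)
```

For the first observable the two sides agree within the finite-difference error, while for the commuting observable both sides vanish.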

Finally, I would like you to note a remarkable similarity between the Ehrenfest theorem and Eq. 3.8 expressing the time derivative of a classical function of coordinate and momentum in terms of its Poisson brackets with the classical Hamiltonian: the two equations become identical if one makes the substitution $\{\cdots\} \to -(i/\hbar)[\cdots]$. This similarity is physically significant as illustrated by the following example.

Let me apply the Ehrenfest theorem to a very special but extremely important case of coordinate and momentum operators of a single particle described by a time-independent Hamiltonian, like the one given in Eq. 3.48. For simplicity I will limit myself to a one-dimensional case, so that I will only need to consider one component of the position and momentum operators and can treat the potential energy as a function of a single coordinate only. The Ehrenfest theorem involves commutators of the respective operators with the Hamiltonian. In the case under consideration, I have to compute $[\hat{x}, \hat{H}]$ and $[\hat{p}_x, \hat{H}]$ for $\hat{H}$ given by the one-dimensional version of Eq. 3.48, which I reproduce below for your convenience:

$$\hat{H} = \frac{\hat{p}_x^2}{2m} + V(\hat{x}).$$

The easiest commutator to compute is $[\hat{x}, \hat{H}]$:

$$\left[\hat{x}, \hat{H}\right] = \left[\hat{x}, \frac{\hat{p}_x^2}{2m}\right] = \frac{i\hbar}{m}\hat{p}_x \tag{4.18}$$

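where I used the fact that $\hat{x}$ commutes with $V(\hat{x})$ as well as identity 3.24 and the canonical commutation relation, Eq. 3.47. It takes a bit more labor to compute $[\hat{p}_x, \hat{H}] = [\hat{p}_x, V(\hat{x})]$. In order to evaluate this commutator, I first present the potential energy as a power series (with the derivatives evaluated at $x = 0$):

$$V(\hat{x}) = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^n V}{dx^n}\hat{x}^n$$

so that I can write

$$[\hat{p}_x, V(\hat{x})] = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^n V}{dx^n}\left[\hat{p}_x, \hat{x}^n\right].$$

Again using identity 3.24 to evaluate the commutator $[\hat{p}_x, \hat{x}^n]$, I get

$$\left[\hat{p}_x, \hat{x}^n\right] = -i\hbar n\hat{x}^{n-1},$$

substitution of which into the previous equation yields

$$[\hat{p}_x, V(\hat{x})] = -i\hbar\sum_{n=1}^{\infty}\frac{1}{(n-1)!}\frac{d^n V}{dx^n}\hat{x}^{n-1} \equiv -i\hbar\sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^{n+1} V}{dx^{n+1}}\hat{x}^n.$$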

Here I took into account that the $n = 0$ term of the initial series is a constant and vanishes upon the differentiation. Correspondingly the summation in the middle expression above begins with $n = 1$. In the next step, I changed the dummy index of summation $n - 1 \to n$, turning the $n = 1$ term into $n = 0$, $n = 2$ into $n = 1$, and so on. This process naturally forces me to replace the $n$th derivative with the $(n+1)$th and $\hat{x}^{n-1}$ with $\hat{x}^n$. All that is left now is to recognize that the final resulting power series is the expansion of the derivative of function $V(x)$:

$$\frac{dV}{dx} = \sum_{n=0}^{\infty}\frac{1}{n!}\frac{d^{n+1} V}{dx^{n+1}}\hat{x}^n.$$

Thus, I can proudly present

$$[\hat{p}_x, V(\hat{x})] = -i\hbar\frac{dV}{dx}. \tag{4.19}$$

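Now, the Ehrenfest theorem for these two operators becomes

$$\begin{aligned}
\frac{d\langle\hat{x}\rangle}{dt} &= \frac{\langle\hat{p}_x\rangle}{m} \\
\frac{d\langle\hat{p}_x\rangle}{dt} &= -\left\langle\frac{dV}{dx}\right\rangle.
\end{aligned}$$

Repeating the same calculations for all three components of the position and the momentum vectors, you can easily obtain the three-dimensional version of these equations:

$$\frac{d\langle\hat{\mathbf{r}}\rangle}{dt} = \frac{\langle\hat{\mathbf{p}}\rangle}{m} \tag{4.20}$$

$$\frac{d\langle\hat{\mathbf{p}}\rangle}{dt} = -\langle\nabla V\rangle \tag{4.21}$$

where $\nabla V$ (in case you forgot) is the gradient of $V$ defined in the Cartesian coordinates with unit vectors $\mathbf{e}_x$, $\mathbf{e}_y$, and $\mathbf{e}_z$ in the direction of the corresponding axes $X$, $Y$, and $Z$ as

$$\nabla V = \mathbf{e}_x\frac{\partial V}{\partial x} + \mathbf{e}_y\frac{\partial V}{\partial y} + \mathbf{e}_z\frac{\partial V}{\partial z}.$$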

The obtained equations resemble classical Hamiltonian equations, but it actually would be wrong to say (as many textbooks do) that the Ehrenfest equations make expectation values of position and momentum operators behave like the corresponding classical quantities. In reality these equations do not even constitute a closed system of equations, which becomes almost obvious once you realize that, generally speaking, $\left\langle dV/dx\right\rangle \neq \left. dV/dx\right|_{x = \langle\hat{x}\rangle}$. Equality here is realized only if the potential energy is either a linear or a quadratic function of the coordinates. In the former case $-d\hat{V}/dx = F = \mathrm{const}$, so that the Ehrenfest equations have a simple solution:

$$\langle\hat{p}\rangle = p_0 + Ft; \quad \langle\hat{x}\rangle = x_0 + (p_0/m)t + (1/2)(F/m)t^2,$$

reproducing classical equations for a particle moving with constant acceleration. In the case of a quadratic potential (harmonic oscillator)

$$d\hat{V}/dx = k\hat{x}$$

so that

$$\left\langle d\hat{V}/dx\right\rangle = k\langle\hat{x}\rangle,$$

reducing the Ehrenfest equations to classical equations describing a harmonic oscillator. To illustrate the difficulty arising in a more general situation, consider $V(x) = ax^3/3$. In this case the Ehrenfest equations become

$$\begin{aligned}
\frac{d\langle\hat{x}\rangle}{dt} &= \frac{\langle\hat{p}\rangle}{m} \\
\frac{d\langle\hat{p}\rangle}{dt} &= -a\langle x^2\rangle.
\end{aligned}$$

Since $\langle x^2\rangle \neq \langle x\rangle^2$, the resulting system of equations is not closed, because now you need to derive a separate equation for $\langle x^2\rangle$. Trying to do so (see the exercises) will appear as a recurring nightmare: you will end up with new variables at each step, and this process will never end. You might wonder, of course, whether it is possible to find a state for which $\langle\hat{x}^n\rangle = \langle\hat{x}\rangle^n$, in which case the Ehrenfest equations would literally coincide with the Hamiltonian equations. It is not very difficult to prove that the only state in which this might be true is the state represented by an eigenvector of the coordinate operator. Unfortunately, even if at some time $t = 0$ you can create a system in such a state, it will lose this property as it evolves in time. To see that this is indeed the case, imagine a state $|x(t)\rangle$ such that $\hat{x}|x(t)\rangle = x(t)|x(t)\rangle$ and try to plug it into Eq. 4.9 describing the dynamics of quantum states. You will immediately see that since the coordinate and momentum operators do not commute, this state cannot be a solution of the time-dependent Schrödinger equation.

There is, however, another way to make the Ehrenfest equations identical to their classical Hamiltonian counterparts. All you need to do is to neglect the quantum uncertainty of coordinate and momentum. Since technically these uncertainties arise from the canonical commutation relation, you can do away with them by passing to the limit $\hbar \to 0$. The emergence of Hamiltonian equations in this so-called classical limit is a very attractive and soothing feature of the quantum formalism, indicating that the developed theory adheres to the correspondence principle formulated (again!) by Niels Bohr. This principle played an important heuristic and philosophical role in the development of quantum theory. It states that the quantum theory must reproduce the results of classical physics in situations where classical physics is known to be valid. Even though the concrete mathematical expressions defining situations when the quantum description must reduce to the classical one vary from phenomenon to phenomenon, they all involve taking the limit $\hbar \to 0$. In this limit, for instance, the quantum of energy $\hbar\omega$ introduced by Planck vanishes, the de Broglie wavelength $\lambda = h/p$ goes to zero, and quantum uncertainties of various observables, which prevented you from replacing $\langle\hat{x}^n\rangle \to \langle\hat{x}\rangle^n$, disappear.


4.2 Heisenberg Picture

As I mentioned in the beginning of this chapter, quantum dynamics can be described by imposing a time dependence on operators rather than on the states. This approach, a version of which was designed by Heisenberg, Born, and Jordan, was historically first, is directly connected with classical Hamiltonian equations, and is quite popular in current research literature on quantum mechanics, especially in quantum optics. However, for some reason it rarely appears in undergraduate texts on quantum theory. Probably, it is believed that the idea of time-dependent operators is too complicated for infirm minds of undergraduate physics majors to comprehend, but I personally do not see why this must be the case. So, let’s try to remove the veil of mystery from this alternative version of quantum theory, called the Heisenberg picture.

At first glance, the idea of time-dependent operators seems indeed quite strange: if an operator is, e.g., a prescription to differentiate a function, how can this rule change with time? The best way to answer this type of question is to first develop a formal way to describe the time dependence of operators and, then, to illustrate it using a few simple examples.

I begin by considering an expectation value of some arbitrary operator $\hat{A}$ in state $|\alpha(t)\rangle$: $\langle\alpha(t)|\hat{A}|\alpha(t)\rangle$. Using the time-evolution operator defined in Eq. 4.2, I can present this expectation value as

$$\langle A(t)\rangle = \langle\alpha_0|\hat{U}^\dagger(t, 0)\hat{A}(t)\hat{U}(t, 0)|\alpha_0\rangle. \tag{4.22}$$

Lumping together all three operators appearing between the bra and ket vectors in Eq. 4.22 into a new operator

$$\hat{A}_H(t) = \hat{U}^\dagger(t, 0)\hat{A}(t)\hat{U}(t, 0) \tag{4.23}$$

yields a time-dependent Heisenberg representation of the initial operator. This time dependence has, in general, two sources: an external time dependence of the initial Schrödinger operator discussed in the previous section and an internal time dependence responsible for the quantum dynamics of the system represented by the time-evolution operators. Differentiating this equation with respect to time, I obtain

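$$\frac{d\hat{A}_H(t)}{dt} = \frac{d\hat{U}^\dagger(t,0)}{dt}\hat{A}(t)\hat{U}(t,0) + \hat{U}^\dagger(t,0)\hat{A}(t)\frac{d\hat{U}(t,0)}{dt} + \hat{U}^\dagger(t,0)\frac{\partial\hat{A}(t)}{\partial t}\hat{U}(t,0).$$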


Using Eq. 4.8 this can be rewritten as

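$$\frac{d\hat{A}_H(t)}{dt} = \frac{i}{\hbar}\hat{U}^\dagger(t,0)\hat{H}\hat{A}(t)\hat{U}(t,0) - \frac{i}{\hbar}\hat{U}^\dagger(t,0)\hat{A}(t)\hat{H}\hat{U}(t,0) + \hat{U}^\dagger(t,0)\frac{\partial\hat{A}(t)}{\partial t}\hat{U}(t,0).$$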

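Taking advantage of the unitarity of the time-evolution operator, I insert the combination $\hat{U}\hat{U}^\dagger \equiv \hat{I}$ between the Hamiltonian and operator $\hat{A}$ in the first two terms of the equation above. This procedure yields

$$\frac{d\hat{A}_H(t)}{dt} = \frac{i}{\hbar}\underbrace{\hat{U}^\dagger(t,0)\hat{H}\hat{U}}\,\underbrace{\hat{U}^\dagger\hat{A}(t)\hat{U}(t,0)} - \frac{i}{\hbar}\underbrace{\hat{U}^\dagger(t,0)\hat{A}(t)\hat{U}}\,\underbrace{\hat{U}^\dagger\hat{H}\hat{U}(t,0)} + \underbrace{\hat{U}^\dagger(t,0)\frac{\partial\hat{A}(t)}{\partial t}\hat{U}(t,0)}$$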

where each of the bracketed terms defines, according to Eq. 4.23, a Heisenberg representation of the corresponding operator. Therefore, the last equation can be rewritten as

$$\frac{d\hat{A}_H(t)}{dt} = -\frac{i}{\hbar}\left[\hat{A}_H, \hat{H}_H\right] + \frac{\partial\hat{A}_H(t)}{\partial t}. \tag{4.24}$$

The resulting equation is called the Heisenberg equation for time-dependent operators. It looks very much like the Ehrenfest theorem, and just like the latter, it resembles the classical Eq. 3.8. However, unlike the Ehrenfest theorem, the Heisenberg equation describes the time evolution of operators rather than of expectation values and, therefore, does not suffer from the perpetual emergence of new variables. The Ehrenfest theorem can be obtained from Eq. 4.24 by computing the expectation values of both sides of this equation with the initial state $|\alpha_0\rangle$. The initial condition for the Heisenberg equation can be easily ascertained from Eq. 4.23: setting $t = 0$ in this equation immediately shows that the initial conditions for Heisenberg operators are given by the corresponding Schrödinger operators, establishing an intimate connection between the two pictures. Now you can answer the question posed at the beginning of this section: how can a rule representing an operator change with time? The time dependence comes from combinations of various immutable rules with time-dependent coefficients. You will see an example of this a few paragraphs below.
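As a quick numerical sanity check of Eq. 4.24, here is a minimal Python sketch for a two-level system. The diagonal Hamiltonian, the choice of $\hat{A}$ as the Pauli matrix $\sigma_x$, the units with $\hbar = 1$, and all function names are assumptions made purely for illustration. The sketch builds $\hat{A}_H(t)$ from Eq. 4.23, using the standard form $\exp(-i\hat{H}t/\hbar)$ of the time-evolution operator for a time-independent Hamiltonian, and compares a finite-difference time derivative with the right-hand side of Eq. 4.24 (the operator has no explicit time dependence, so the partial-derivative term vanishes).

```python
import cmath

HBAR = 1.0  # natural units, an assumption of this sketch

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

# Illustrative time-independent Hamiltonian (diagonal) and Schrodinger operator A
E1, E2 = 0.5, -0.5
H = [[E1 + 0j, 0j], [0j, E2 + 0j]]
A = [[0j, 1 + 0j], [1 + 0j, 0j]]  # Pauli sigma_x

def U(t):
    """Time-evolution operator exp(-i H t / hbar) for the diagonal H above."""
    return [[cmath.exp(-1j * E1 * t / HBAR), 0j],
            [0j, cmath.exp(-1j * E2 * t / HBAR)]]

def heisenberg(op, t):
    """Heisenberg operator U^dagger op U, as in Eq. 4.23."""
    u = U(t)
    return matmul(matmul(dagger(u), op), u)

def heisenberg_rhs(op, t):
    """Right-hand side of Eq. 4.24: -(i/hbar) [A_H, H_H], with H_H = H here."""
    a_h = heisenberg(op, t)
    ah, ha = matmul(a_h, H), matmul(H, a_h)
    return [[(-1j / HBAR) * (ah[i][j] - ha[i][j]) for j in range(2)]
            for i in range(2)]

def numerical_derivative(op, t, dt=1e-5):
    """Central-difference estimate of dA_H/dt."""
    plus, minus = heisenberg(op, t + dt), heisenberg(op, t - dt)
    return [[(plus[i][j] - minus[i][j]) / (2 * dt) for j in range(2)]
            for i in range(2)]

t = 0.7
lhs = numerical_derivative(A, t)
rhs = heisenberg_rhs(A, t)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest mismatch between the two sides:", max_error)
```

The mismatch should be limited only by the finite-difference step, which is what one expects if the two sides of Eq. 4.24 agree.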

The Hamiltonian appearing in Eq. 4.24 is the Heisenberg representation of the regular Schrödinger Hamiltonian and must be evaluated before the Heisenberg equations can be used. However, in the important special case of a time-independent Schrödinger Hamiltonian, one can easily show that $\hat{H}_H \equiv \hat{H}$. Indeed, one can easily infer from Eq. 4.10 that the time-evolution operator for a time-independent Hamiltonian is

$$\hat{U}(t,0) = \exp\left(-i\hat{H}t/\hbar\right). \tag{4.25}$$

This operator obviously commutes with the Hamiltonian, and as a result we have

$$\hat{H}_H = \hat{U}^\dagger\hat{H}\hat{U} = \hat{U}^\dagger\hat{U}\hat{H} = \hat{H}.$$

The same arguments apply to any Schrödinger operator commuting with the Hamiltonian, so all such operators remain independent of time. This result also obviously follows from Eq. 4.24. Thus, one can say that operators commuting with the Hamiltonian represent conserved quantum observables not only at the level of expectation values, as in the Ehrenfest theorem, but at the deeper level of the operators themselves.

In the case of Hamiltonians with explicit time dependence, you can no longer claim that the Heisenberg representation of the Hamiltonian coincides with the Schrödinger one. The Heisenberg picture in this case loses its immediate appeal, and people often prefer a picture, intermediate between Schrödinger’s and Heisenberg’s, called interaction representation. In this representation the Hamiltonian is divided into time-independent and time-dependent parts:

$$\hat{H}(t) = \hat{H}_0 + \hat{V}(t).$$

The transition to new operators is now carried out using the time-evolution operator $\hat{U}(t,0) = \exp\left(-i\hat{H}_0t/\hbar\right)$. As a result one ends up with both the operators and the state of the system depending on time: the dynamics of the operators is defined by operator $\hat{H}_0$ and the dynamics of the states by $\hat{V}(t)$. However, something tells me that continuing with this line of thought would bring me way over the line allowed in an undergraduate course. So, consider this a teaser and a preview of things to come if you decide to deepen your knowledge of quantum theory.

Now back to the time-independent Hamiltonians. The same calculations as in the case of Ehrenfest equations yield the following Heisenberg equations for the one-dimensional motion of a quantum particle:

$$\frac{d\hat{x}}{dt} = \frac{\hat{p}_x}{m_e} \tag{4.26}$$

$$\frac{d\hat{p}_x}{dt} = -\frac{d\hat{V}}{dx} \tag{4.27}$$

which coincide with the respective Hamiltonian equations of classical mechanics. Again, just as in the case of the Ehrenfest theorem, these equations can be easily generalized to the three-dimensional case:

$$\frac{d\hat{\mathbf{r}}}{dt} = \frac{\hat{\mathbf{p}}}{m_e} \tag{4.28}$$

$$\frac{d\hat{\mathbf{p}}}{dt} = -\nabla\hat{V}. \tag{4.29}$$

To illustrate how Heisenberg equations work, let me consider the case of a one-dimensional harmonic oscillator, a particle moving in a quadratic potential of the form $V = \frac{1}{2}m\omega^2x^2$, in which case Eq. 4.27 becomes

$$\frac{d\hat{p}_x}{dt} = -m\omega^2\hat{x}. \tag{4.30}$$

Differentiation of this equation with respect to time yields a differential equation of the second order:

$$\frac{d^2\hat{p}_x}{dt^2} = -\omega^2\hat{p}_x,$$

where the term $d\hat{x}/dt$ is replaced with $\hat{p}_x/m$ with the help of Eq. 4.26. It is straightforward to verify that the equation for the momentum operator is solved by

$$\hat{p}(t) = \hat{\eta}_1\cos\omega t + \hat{\eta}_2\sin\omega t \tag{4.31}$$

while an expression for the coordinate operator is obtained from Eq. 4.30 by simple differentiation:

$$\hat{x}(t) = \frac{\hat{\eta}_1}{m\omega}\sin\omega t - \frac{\hat{\eta}_2}{m\omega}\cos\omega t. \tag{4.32}$$

The unknown operators $\hat{\eta}_{1,2}$ in Eqs. 4.31 and 4.32 are to be determined from the initial conditions. (The general solution of any linear differential equation is a combination of particular solutions with undefined constant coefficients. Since we are dealing with operator equations, these unknown coefficients must also be operators.) Substituting $t = 0$ in the found solutions, you can see that

$$\hat{\eta}_1 = \hat{p}_{x0}, \qquad \hat{\eta}_2 = -m\omega\hat{x}_0$$

so that, just as I advertised, the time-dependent momentum and coordinate operators are expressed as linear combinations of the Schrödinger operators $\hat{p}_{x0}$ and $\hat{x}_0$ with time-dependent coefficients:

$$\hat{p}(t) = \hat{p}_{x0}\cos\omega t - m\omega\hat{x}_0\sin\omega t$$

$$\hat{x}(t) = \frac{\hat{p}_{x0}}{m\omega}\sin\omega t + \hat{x}_0\cos\omega t. \tag{4.33}$$

The obtained solution for the Heisenberg operators looks identical to the solution of the classical harmonic oscillator problem, but do not be deceived by this similarity. For instance, in the classical case one can envision initial conditions such that either $x_0$ or $p_{x0}$ (not both, of course) is zero. In the quantum case these are operators and cannot be set to zero. At this point you do not know enough about the properties of the quantum harmonic oscillator to analyze this result any further, so I will postpone doing this till later. Still, we can have a bit of fun and, as an exercise, calculate the commutator between operators $\hat{p}(t)$ and $\hat{x}(t)$, taken not necessarily at the same time.

Example 15 (Commutator of Heisenberg Operators for Harmonic Oscillator)

$$\left[\hat{x}(t_1), \hat{p}(t_2)\right] = \sin\omega t_1\cos\omega t_2\left[\frac{\hat{p}_{x0}}{m\omega}, \hat{p}_{x0}\right] + \cos\omega t_1\cos\omega t_2\left[\hat{x}_0, \hat{p}_{x0}\right] - \sin\omega t_1\sin\omega t_2\left[\hat{p}_{x0}, \hat{x}_0\right] - \sin\omega t_2\cos\omega t_1\left[\hat{x}_0, m\omega\hat{x}_0\right]$$

$$= i\hbar\cos\omega t_1\cos\omega t_2 + i\hbar\sin\omega t_1\sin\omega t_2 = i\hbar\cos\left[\omega(t_1 - t_2)\right]$$

It is interesting to note that this commutator depends only on the time interval $t_1 - t_2$ and not on $t_1$ and $t_2$ separately. The equal-time commutator ($t_1 = t_2$) coincides with the canonical commutator for the Schrödinger coordinate and momentum operators.
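The result of Example 15 can also be checked numerically. The sketch below is only an illustration under stated assumptions: it represents $\hat{x}_0$ and $\hat{p}_{x0}$ by the standard truncated number-basis matrices built from ladder operators (a representation not yet introduced at this point in the text), sets $\hbar = m = \omega = 1$, and uses hypothetical helper names. Because truncation spoils the canonical commutator only in the last diagonal element, the comparison is made for the top-left matrix element.

```python
import math

HBAR, M, OMEGA = 1.0, 1.0, 1.0  # illustrative units
N = 8                           # truncation size, an assumption of this sketch

def matmul(a, b):
    """Product of two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def combine(c1, a, c2, b):
    """Return the matrix c1*a + c2*b."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(N)] for i in range(N)]

# Lowering operator in the number basis and its transpose (the raising operator)
lower = [[math.sqrt(j) if i == j - 1 else 0.0 for j in range(N)] for i in range(N)]
upper = [[lower[j][i] for j in range(N)] for i in range(N)]

cx = math.sqrt(HBAR / (2 * M * OMEGA))
cp = math.sqrt(M * OMEGA * HBAR / 2)
x0 = combine(cx, lower, cx, upper)             # x0 = cx (a + a^dagger)
p0 = combine(-1j * cp, lower, 1j * cp, upper)  # p0 = i cp (a^dagger - a)

def x_heis(t):
    """Heisenberg coordinate operator from Eq. 4.33."""
    return combine(math.sin(OMEGA * t) / (M * OMEGA), p0, math.cos(OMEGA * t), x0)

def p_heis(t):
    """Heisenberg momentum operator from Eq. 4.33."""
    return combine(math.cos(OMEGA * t), p0, -M * OMEGA * math.sin(OMEGA * t), x0)

t1, t2 = 0.9, 0.2
xp = matmul(x_heis(t1), p_heis(t2))
px = matmul(p_heis(t2), x_heis(t1))
commutator_00 = xp[0][0] - px[0][0]
expected = 1j * HBAR * math.cos(OMEGA * (t1 - t2))
print(commutator_00, expected)
```

Changing `t1` and `t2` while keeping their difference fixed should leave `commutator_00` unchanged, in line with the remark above.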

It is also fun to think about eigenvectors and eigenvalues of the Heisenberg operators in Eq. 4.33, but I will let you play this game as an exercise.

4.3 Problems

Section 4.1.1

Problem 45 Consider a Hamiltonian presented by a $2\times 2$ matrix

$$\hat{H} = \hbar\omega\begin{bmatrix}\cos\theta & \sin\theta\exp(i\varphi)\\ \sin\theta\exp(-i\varphi) & -\cos\theta\end{bmatrix}.$$

1. Find the Hermitian conjugate and the inverse matrix and convince yourself that this operator is simultaneously Hermitian and unitary.

2. Using the representation of an exponential function as a power series, evaluate the time-evolution operator for this Hamiltonian.

3. Assume that the initial state of the system is given by a vector

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ 1\end{bmatrix}$$

and find $|\alpha(t)\rangle$ using the time-evolution operator.

Section 4.1.2

Problem 46 Prove that if $\hat{H}|\chi_n\rangle = E_n|\chi_n\rangle$, then $\hat{H}^m|\chi_n\rangle = E_n^m|\chi_n\rangle$.

Problem 47 Consider a system with Hamiltonian

$$\hat{H} = \frac{\hat{p}^2}{2m_e} + V(r).$$

Assume that you know its eigenvectors $|\chi_n\rangle$ and eigenvalues $E_n$. Show that if you change the potential in this Hamiltonian to $V(r) - E_0$, all the eigenvectors will stay the same, the eigenvalue $E_0$ will become equal to zero, and all other eigenvalues will become $E_n - E_0$.

Problem 48 Re-derive Eq. 4.14 factoring out $\exp\left(i\frac{E_2}{\hbar}t\right)$ instead of $\exp\left(i\frac{E_1}{\hbar}t\right)$, and demonstrate that this result does not change.

Problem 49 Consider a system described by a Hamiltonian

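$$\hat{H} = E_0\begin{bmatrix}1 & ia\\ -ia & 1\end{bmatrix}.$$

1. Find the stationary states of this Hamiltonian.

2. Assuming that at $t = 0$ the system is in the state

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ i\end{bmatrix},$$

find $|\alpha(t)\rangle$ using the stationary states of the Hamiltonian.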

Section 4.1.3

Problem 50 Prove that $\langle\alpha|\hat{x}^2|\alpha\rangle = \left(\langle\alpha|\hat{x}|\alpha\rangle\right)^2$ if and only if $\hat{x}|\alpha\rangle = x|\alpha\rangle$. It is not very difficult to prove that $\hat{x}|\alpha\rangle = x|\alpha\rangle$ implies $\langle\alpha|\hat{x}^2|\alpha\rangle = \left(\langle\alpha|\hat{x}|\alpha\rangle\right)^2$, but the proof of the opposite statement requires a bit more ingenuity. You can try to prove it by demonstrating that if $\hat{x}|\alpha\rangle \neq x|\alpha\rangle$, then $\langle\alpha|\hat{x}^2|\alpha\rangle$ cannot be equal to $\left(\langle\alpha|\hat{x}|\alpha\rangle\right)^2$.

Problem 51 Go back to the problem involving the one-dimensional motion of a particle in the cubic potential $\hat{V} = a\hat{x}^3/3$ discussed in Sect. 3.3.1. It has been shown in the text that the Ehrenfest equation for $\langle\hat{p}\rangle$ involves the expectation value $\langle x^2\rangle$. Derive the Ehrenfest equation for this quantity. Do you see expectation values of any new operators or combinations of operators in the equation for $\langle x^2\rangle$? Derive Ehrenfest equations for those new quantities. Comment on the results.

Section 4.2

Problem 52 You will learn in the following section that quantum states can be described by functions of coordinates, called wave functions, in which case the Schrödinger momentum operator becomes

$$\hat{p}_{x0}\,\psi(x) = -i\hbar\frac{d\psi}{dx}$$

while the coordinate operator becomes simple multiplication by the coordinate, $\hat{x}_0\,\psi(x) = x\,\psi(x)$. Using this form of the time-independent operators, find the functions representing eigenvectors of their time-dependent Heisenberg counterparts:

$$\hat{p}(t) = \hat{p}_{x0}\cos\omega t - m\omega\hat{x}_0\sin\omega t$$

$$\hat{x}(t) = \frac{\hat{p}_{x0}}{m\omega}\sin\omega t + \hat{x}_0\cos\omega t.$$

Analyze the behavior of these eigenvectors as functions of time; especially, consider the limits $t = \pi n$ and $t = \pi/2 + \pi n$, where $n = 0, 1, 2, \cdots$. Hint: Time-dependent terms here are just parameters, and their time dependence does not affect how you solve the respective differential equations.

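Problem 53 Derive Heisenberg equations for operators $\hat{a}$, $\hat{a}^\dagger$ and $\hat{b}$, $\hat{b}^\dagger$ appearing in the following Hamiltonian:

$$\hat{H} = \hbar\omega\,\hat{a}^\dagger\hat{a} + \hbar\Omega\,\hat{b}^\dagger\hat{b} + \gamma\left(\hat{a}^\dagger\hat{b} + \hat{b}^\dagger\hat{a}\right).$$

Commutation relations for these operators are as follows:

$$\left[\hat{a}, \hat{a}^\dagger\right] = 1; \quad \left[\hat{b}, \hat{b}^\dagger\right] = 1; \quad \left[\hat{a}, \hat{b}^\dagger\right] = 0; \quad \left[\hat{a}, \hat{b}\right] = 0.$$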

Chapter 5 Representations of Vectors and Operators

5.1 Representation in Continuous Basis

We have managed to get through four chapters of this text without specifying any concrete form of the state vectors, treating them as abstractions defined only by the rules of the games that we could play with them. This approach is very convenient and rewarding from a theoretical point of view, as it emphasizes the generality of the quantum approach to the world and allows one to derive a number of important general results with relative ease. However, when it comes to responding to experimentalists' requests to explain or predict their quantitative experimental results, we do need something a bit more concrete and tangible than the idea of an abstract vector. A similar situation actually arises in the case of our regular three-dimensional geometric vectors. It is often convenient to think of them as purely geometrical objects (arrows, for instance) and derive results independent of any choice of coordinate system. However, at some point you will need to get to some "down-to-earth" computations, and to carry them out, you will have to choose a coordinate system and replace the "arrows" with a set of numbers, the vector components.

In the case of abstract vectors that live in an abstract linear vector space, you can use the same idea to get a more concrete and handy representation of the quantum states. All these representations require that we use a basis in our abstract space. It seems more logical to begin with representations based on discrete bases, but in reality, we are somewhat forced to start with continuous bases. The reason for this is that the two main observables in quantum mechanics, from which almost all others can be constructed, are the position and the momentum (see Sect. 3.3.2). Operators corresponding to these observables have continuous spectra, and, therefore, you will have to learn how to represent these operators using continuous bases.

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_5


5.1.1 Position and Momentum Operators in a Continuous Basis

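Let me begin with some abstract operator $\hat{C}$ and a continuous basis formed by orthonormalized vectors $|q\rangle$:

$$\langle q|q'\rangle = \delta(q - q'),$$

where $q$ is a continuously changing parameter. You already know from Sect. 2.3 that an abstract ket vector $|\alpha\rangle$ can be presented as an integral:

$$|\alpha\rangle = \int dq\,\varphi_\alpha(q)\,|q\rangle. \tag{5.1}$$

Hermitian conjugation of this expression produces its bra counterpart:

$$\langle\alpha| = \int dq\,\varphi_\alpha^*(q)\,\langle q|. \tag{5.2}$$

I can now write down the inner product between vectors $|\alpha\rangle$ and $|\beta\rangle$ as

$$\langle\alpha|\beta\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\varphi_\beta(q')\langle q|q'\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\varphi_\beta(q')\,\delta(q - q') = \int dq\,\varphi_\alpha^*(q)\varphi_\beta(q). \tag{5.3}$$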

Subindexes $\alpha$ and $\beta$ in these expressions indicate the correspondence between an abstract vector and the respective function appearing in the superposition given by Eq. 5.1. Applying Eq. 5.3 to the case when $|\alpha\rangle = |\beta\rangle$, I reproduce Eq. 2.42, and by recalling that all state vectors must be normalized, I end up with the condition

$$\int dq\,\varphi_\alpha^*(q)\varphi_\alpha(q) = 1 \tag{5.4}$$

generalizing Eq. 2.43, which was originally derived only for functions of coordinates. It should be noted here that while I am using a single variable $q$ as an argument of the functions $\varphi(q)$, you must understand that it is just a convenient notation, and in reality, $q$ can represent several variables. For instance, eigenvectors of the position operator depend on three components of the position vector, but we have been using the single symbol $\mathbf{r}$ to designate them all.

As long as we all agree on the choice of the basis, and do not change it in the middle of a conversation (or calculations), we have a one-to-one correspondence between abstract vectors and the respective superposition coefficients. The function $\varphi_\alpha(q)$ provides a complete description of the corresponding vector and can, therefore, be considered as its faithful representation. It can be expressed in terms of vector $|\alpha\rangle$ and the basis vectors by premultiplying Eq. 5.1 by $\langle q'|$ and using the orthonormality condition:

$$\varphi_\alpha(q) = \langle q|\alpha\rangle. \tag{5.5}$$

This essentially completes the discussion of the representation of vectors, but this was the easy part. You also need to learn how to find a representation of operators appropriate for the developed representation of vectors, which is the hard part. For starters, I need to explain to you what it means to represent an operator. Consider an expression

$$|\beta\rangle = \hat{Q}|\alpha\rangle$$

where two abstract vectors are related to each other by an abstract operator $\hat{Q}$. It seems reasonable to define a representation of the operator as an object that would yield the same relation between functions $\varphi_\alpha(q)$ and $\varphi_\beta(q)$ representing the corresponding abstract vectors $|\alpha\rangle$ and $|\beta\rangle$. In order to figure it out, let me insert the completeness condition for the continuous spectrum, Eq. 3.45, formed with basis vectors $|q\rangle$ into three places in $|\beta\rangle = \hat{Q}|\alpha\rangle$: in front of vector $|\beta\rangle$, in front of operator $\hat{Q}$, and between the operator and $|\alpha\rangle$:

$$\int dq\,|q\rangle\langle q|\beta\rangle = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|\alpha\rangle. \tag{5.6}$$

(This amounts to inserting unity operators in all places, so, obviously, I have not changed anything.) Using Eq. 5.5 I get

$$\int dq\,|q\rangle\,\varphi_\beta(q) = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q'),$$

which can be rewritten as

$$\int dq\,|q\rangle\left[\varphi_\beta(q) - \int dq'\,\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q')\right] = 0.$$

Since the vectors of the basis are linearly independent, the integral in this expression can be zero only if the integrand is zero, which yields

$$\varphi_\beta(q) = \int dq'\,\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q') \equiv \int dq'\,Q(q, q')\,\varphi_\alpha(q') \tag{5.7}$$

where I introduced $Q(q, q') = \langle q|\hat{Q}|q'\rangle$. Thus, the abstract operator $\hat{Q}$ in the continuous basis takes the form of an integral operator with kernel $\langle q|\hat{Q}|q'\rangle$. If we know how this operator acts on the basis vectors, we can determine the kernel and replace the abstract relation $|\beta\rangle = \hat{Q}|\alpha\rangle$ by the integral relation given by Eq. 5.7.


For instance, you can easily find the kernel if the basis is formed by eigenvectors of operator $\hat{Q}$. Indeed, if you know that $\hat{Q}|q\rangle = q|q\rangle$, then $Q(q, q') = \langle q|\hat{Q}|q'\rangle = q\,\delta(q - q')$ and Eq. 5.7 simplifies to

$$\varphi_\beta(q) = \int dq'\,Q(q, q')\,\varphi_\alpha(q') = q\,\varphi_\alpha(q), \tag{5.8}$$

i.e., the integral operator is reduced to simple multiplication by the corresponding eigenvalue, and what can be simpler? Unfortunately, life is always more complicated, and in most cases, you will have to deal simultaneously with at least two non-commuting operators, so that eigenvectors of one operator will not be the eigenvectors of the other. To deal with this situation, you have to learn how to represent an operator in a basis formed by vectors that are not its eigenvectors.
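To make the reduction from Eq. 5.7 to Eq. 5.8 more tangible, here is a small Python sketch that puts the integral on a grid. Everything in it is an illustrative assumption rather than part of the text: the uniform grid, the replacement of the delta-function by $1/\Delta q$ on the diagonal, the Gaussian test function, and the function names.

```python
import math

# Uniform grid of q values (an illustrative discretization)
N = 201
Q_MIN, Q_MAX = -5.0, 5.0
dq = (Q_MAX - Q_MIN) / (N - 1)
grid = [Q_MIN + i * dq for i in range(N)]

def delta_kernel(i, j):
    """Grid stand-in for delta(q_i - q_j): 1/dq on the diagonal, 0 elsewhere."""
    return 1.0 / dq if i == j else 0.0

def kernel(i, j):
    """Q(q, q') = q * delta(q - q'), the kernel of Q in its own eigenbasis."""
    return grid[i] * delta_kernel(i, j)

def apply_kernel(phi):
    """phi_beta(q_i) = sum_j Q(q_i, q_j) phi_alpha(q_j) dq, i.e. Eq. 5.7 on a grid."""
    return [sum(kernel(i, j) * phi[j] for j in range(N)) * dq for i in range(N)]

phi_alpha = [math.exp(-q * q / 2) for q in grid]  # arbitrary test function
phi_beta = apply_kernel(phi_alpha)

# Eq. 5.8 predicts phi_beta(q) = q * phi_alpha(q)
max_dev = max(abs(phi_beta[i] - grid[i] * phi_alpha[i]) for i in range(N))
print("largest deviation from q * phi_alpha(q):", max_dev)
```

On the grid the kernel is simply a diagonal matrix, which is the discrete counterpart of the statement that an operator acts by multiplication in the basis of its own eigenvectors.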

Most of the bases used for practical calculations are formed by eigenvectors of some Hermitian operator, and recognition of this fact can be quite useful. So, in addition to the original operator $\hat{Q}$ with eigenvectors $|q\rangle$ and eigenvalues $q$, let me introduce another operator $\hat{S}$ with eigenvectors $|s\rangle$ and eigenvalues $s$. The goal now is to find an integral kernel representing operator $\hat{Q}$ in the basis of vectors $|s\rangle$. In other words, I need to rewrite expressions $\langle s|\hat{Q}|s'\rangle$ in the basis formed by vectors $|q\rangle$. It can be done by exploiting (twice) the same old trick with the completeness relation expressed in terms of these vectors:

$$\langle s|\hat{Q}|s'\rangle = \int dq\,dq'\,\langle s|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|s'\rangle.$$

Now, taking into account that the kernel $Q(q, q')$ in the basis of its own eigenvectors is $Q(q, q') = q\,\delta(q - q')$, I can simplify the above expression into

$$Q(s, s') \equiv \langle s|\hat{Q}|s'\rangle = \int dq\,q\,\langle s|q\rangle\langle q|s'\rangle \tag{5.9}$$

which gives me exactly what I have been looking for. For this expression to be useful, however, you would need to know the functions $\chi_q(s) = \langle s|q\rangle$ and $\chi_s(q) = \langle q|s\rangle$. The first of them can be interpreted as the representation of eigenvector $|q\rangle$ in the basis of vectors $|s\rangle$, and the second one is clearly a representation of $|s\rangle$ in the basis of $|q\rangle$. These functions are related to each other by the property of the inner product described by Eq. 2.19: $\chi_q(s) = \chi_s^*(q)$. So, the whole business of finding the kernel $Q(s, s')$ is now reduced to finding the representation of eigenvectors of $\hat{Q}$ in terms of those of $\hat{S}$ (or vice versa).

Finding the function $\chi_s(q)$ is impossible without bringing in some additional information. It can be, for instance, a commutator between operators $\hat{Q}$ and $\hat{S}$, or just an outright expression for $\chi_s(q)$ obtained empirically or heuristically, on the ground of some physical arguments, or just by divine insight. Whatever method you choose, you need to specify now which operators we are dealing with. Of biggest interest are, of course, the operators of position and momentum, so let's agree to identify operator $\hat{Q}$ with the x-component $\hat{P}_x$ of the momentum operator and $\hat{S}$ with operator $\hat{X}$, the x-component of the position operator. In this case function $\chi_q(s)$ becomes the coordinate representation $\chi_{p_x}(x)$ of the eigenvectors of the momentum operator (its x-component, of course).

This function represents a state of the particle with definite momentum $p_x$, which, according to the de Broglie hypothesis, corresponds to the motion of a free particle and is described by a harmonic wave whose wave vector has x-component $k_x = p_x/\hbar$. Disregarding the time-dependent portion of such a wave, I can write its coordinate-dependent part as $\chi_{p_x}(x) = a\exp(ik_xx) = a\exp(ip_xx/\hbar)$. The choice $a = 1/\sqrt{2\pi\hbar}$ generates, according to Eq. 2.36, a delta-normalized function:

$$\chi_{p_x}(x) = \frac{1}{\sqrt{2\pi\hbar}}\exp(ip_xx/\hbar). \tag{5.10}$$

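Indeed,

$$\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}e^{i(p_x - p_x')x/\hbar}\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i(p_x - p_x')\tilde{x}}\,d\tilde{x} = \delta(p_x - p_x') \tag{5.11}$$

where $\tilde{x} = x/\hbar$. Similarly, you can also find that

$$\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}e^{i(x - x')p_x/\hbar}\,dp_x = \delta(x - x'). \tag{5.12}$$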

Now I can write Eq. 5.9 as

$$P_x(x, x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}dp_x\,p_x\,e^{ip_xx/\hbar}e^{-ip_xx'/\hbar} = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.$$

This integral might look puzzling, because it is obviously diverging. Applying a magic trick, however, I can turn it into something that actually makes sense. The trick is quite popular, so it is useful to have it up your sleeve. Differentiation of Eq. 5.12 with respect to $x$ produces

$$\frac{d\delta(x - x')}{dx} = \frac{i}{2\pi\hbar^2}\int_{-\infty}^{\infty}dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.$$

I hope that you have recognized the integral of interest on the right-hand side of this equation, so that you can find for $P_x(x, x')$:

$$P_x(x, x') = \frac{\hbar}{i}\frac{d\delta(x - x')}{dx}. \tag{5.13}$$


Substituting Eq. 5.13 into Eq. 5.7 with variable $q$ replaced with $s$ (in that equation $q$ was just a generic variable not yet identified with eigenvalues of the operator), I derive a relation between functions $\varphi_\alpha(x)$ and $\varphi_\beta(x)$ representing states $|\alpha\rangle$ and $|\beta\rangle$ in the representation of the eigenvectors of the position operator (position representation for brevity):

$$\varphi_\beta(x) = \frac{\hbar}{i}\int dx'\,\frac{d\delta(x - x')}{dx}\,\varphi_\alpha(x') = \frac{\hbar}{i}\frac{d}{dx}\int dx'\,\varphi_\alpha(x')\,\delta(x - x') = -i\hbar\frac{d\varphi_\alpha(x)}{dx}.$$

Thus, we see that the $\hat{P}_x$ operator in the position (coordinate) representation is equivalent to a differential operator:

$$\hat{p}_x = -i\hbar\frac{d}{dx} \tag{5.14}$$

where I used the lowercase letter for a particular representation of the operator as opposed to the uppercase used for abstract operators. The coordinate operator in the coordinate representation is obviously just an operator of multiplication by the coordinate's eigenvalue.
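A quick way to see Eq. 5.14 at work is to apply it to the plane wave of Eq. 5.10, which should simply be multiplied by its momentum $p_x$. The following Python sketch does this with a central finite difference; the step size, the sample point, the units with $\hbar = 1$, and the function names are assumptions chosen only for illustration.

```python
import cmath
import math

HBAR = 1.0  # illustrative units

def plane_wave(x, p):
    """Delta-normalized momentum eigenfunction of Eq. 5.10."""
    return cmath.exp(1j * p * x / HBAR) / math.sqrt(2 * math.pi * HBAR)

def momentum_operator(f, x, h=1e-5):
    """Apply -i*hbar*d/dx (Eq. 5.14) to f at the point x via a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

p = 1.7   # momentum eigenvalue, arbitrary
x = 0.4   # sample point, arbitrary

def chi(y):
    return plane_wave(y, p)

ratio = momentum_operator(chi, x) / chi(x)
print("(p_x chi)(x) / chi(x) =", ratio)
```

Up to the finite-difference error the printed ratio should equal `p` at any sample point, which is what it means for the plane wave to be an eigenfunction of the differential operator.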

You can turn this analysis around and identify $\hat{S}$ with a component of the momentum operator and $\hat{Q}$ with the x-coordinate. Then $\chi_q(s)$ becomes the momentum representation of the eigenvector of the coordinate:

$$\chi_x(p_x) = \langle p|x\rangle = \left(\langle x|p\rangle\right)^* = \frac{1}{\sqrt{2\pi\hbar}}\exp(-ip_xx/\hbar). \tag{5.15}$$

Repeating the same manipulations as before, you will end up with the momentum representation of the coordinate operator in the form of a differential operator:

$$\hat{x} = i\hbar\frac{d}{dp_x}. \tag{5.16}$$

Equations 5.14 and 5.16 are obtained from each other by interchanging $x \leftrightarrow p_x$ and complex conjugating the result. The complex conjugation bit is, of course, in sync with the fact that the coordinate representation of the momentum's eigenvectors and the momentum representation of the coordinate's eigenvectors are complex conjugates of each other. Obviously, all the same arguments can be carried out for any other Cartesian component of the position and momentum operators, which brings us to the following conclusion. The position representation of the momentum operator is given by the coordinate gradient operator $\vec{\nabla}$ as

$$\hat{\mathbf{p}} = -i\hbar\vec{\nabla}, \tag{5.17}$$

while the momentum representation of the position operator is

$$\hat{\mathbf{r}} = i\hbar\vec{\nabla}_p \tag{5.18}$$

where $\vec{\nabla}_p$ is defined as

$$\vec{\nabla}_p = \mathbf{e}_x\frac{\partial}{\partial p_x} + \mathbf{e}_y\frac{\partial}{\partial p_y} + \mathbf{e}_z\frac{\partial}{\partial p_z} \tag{5.19}$$

and $\mathbf{e}_{x,y,z}$ are unit vectors of the Cartesian coordinate system with axes $X$, $Y$, and $Z$. The functions representing the eigenvectors of the 3-D momentum operator in the position representation and of the 3-D position operator in the momentum representation are, obviously, obtained by multiplying their respective one-dimensional counterparts:

$$\chi_{\mathbf{r}}(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar} \tag{5.20}$$

$$\chi_{\mathbf{p}}(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}. \tag{5.21}$$

You might be wondering how we ended up with $\hat{\mathbf{P}}$ and $\hat{\mathbf{R}}$ being represented by differential rather than by integral operators, as was my original intention. It happened thanks to the singular nature of the kernels $\langle\mathbf{r}'|\hat{\mathbf{P}}|\mathbf{r}\rangle$ and $\langle\mathbf{p}'|\hat{\mathbf{R}}|\mathbf{p}\rangle$, which turned out to be proportional to the derivative of the delta-function. And the delta-function derivatives are quite capable of turning integrals into derivatives, as happened in this particular case.

Having found representations of the coordinate and momentum operators, you can easily compute their commutator. For instance, for the x-components in the coordinate representation, you will easily find

$$\left[\hat{X}, \hat{P}_x\right]f(x) = -i\hbar x\frac{d}{dx}f(x) + i\hbar\frac{d}{dx}\left(xf(x)\right) = i\hbar f(x) \;\Rightarrow\; \left[\hat{X}, \hat{P}_x\right] = i\hbar. \tag{5.22}$$
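The calculation leading to Eq. 5.22 can be mimicked numerically by letting the two operators act on a concrete test function. The sketch below is a hedged illustration: the Gaussian test function, the finite-difference derivative, the units with $\hbar = 1$, and all names are assumptions introduced here.

```python
import math

HBAR = 1.0  # illustrative units

def derivative(f, x, h=1e-5):
    """Central-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

def P(f):
    """Momentum operator in the position representation: f -> -i*hbar*df/dx."""
    return lambda x: -1j * HBAR * derivative(f, x)

def X(f):
    """Position operator in the position representation: f -> x*f."""
    return lambda x: x * f(x)

def commutator_XP(f):
    """The function [X, P] f = X(P f) - P(X f)."""
    return lambda x: X(P(f))(x) - P(X(f))(x)

def f(x):
    return math.exp(-x * x)  # arbitrary smooth test function

x0 = 0.3
result = commutator_XP(f)(x0)
expected = 1j * HBAR * f(x0)
print(result, expected)
```

The order of operations matters: `X(P(f))` differentiates first and then multiplies by `x`, while `P(X(f))` differentiates the product, and the difference between the two is what leaves `i*hbar*f` behind.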

Since this commutator is just a number, it must not depend on the particular representation (this is why I returned to capital letters for the operators). Indeed, the calculations carried out in the momentum representation yield

$$\left[\hat{X}, \hat{P}_x\right]f(p) = i\hbar\frac{d}{dp}\left(pf(p)\right) - i\hbar p\frac{d}{dp}f(p) = i\hbar f(p) \;\Rightarrow\; \left[\hat{X}, \hat{P}_x\right] = i\hbar$$

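as expected. I will finish this section by deriving a relation between functions $\varphi_\alpha(\mathbf{r})$ and $\tilde{\varphi}_\alpha(\mathbf{p})$ representing the same state $|\alpha\rangle$ in the coordinate and momentum representations, respectively. To achieve this, I am again resorting to the magic of the completeness relation based upon the eigenvectors of momentum. Substitution of this relation into Eq. 5.5 adapted for the eigenvectors of the position operator gives

$$\varphi_\alpha(\mathbf{r}) = \langle\mathbf{r}|\alpha\rangle = \int d\mathbf{p}\,\langle\mathbf{r}|\mathbf{p}\rangle\langle\mathbf{p}|\alpha\rangle = \int d\mathbf{p}\,\chi_{\mathbf{p}}(\mathbf{r})\,\tilde{\varphi}_\alpha(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\,e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\,\tilde{\varphi}_\alpha(\mathbf{p}) \tag{5.23}$$

where I used Eq. 5.21 for the position representation of the momentum operator's eigenvector, $\langle\mathbf{r}|\mathbf{p}\rangle = \chi_{\mathbf{p}}(\mathbf{r})$. One can easily invert Eq. 5.23 using the Fourier representation of the delta-function, Eq. 2.36, to obtain

$$\tilde{\varphi}_\alpha(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3r\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\,\varphi_\alpha(\mathbf{r}). \tag{5.24}$$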
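The transition between the two representations is easy to try out numerically in one dimension. The sketch below is an illustration under stated assumptions: it works in 1-D with $\hbar = 1$, takes a normalized Gaussian as the position-space function, uses the kernel $\langle p|x\rangle$ of Eq. 5.15 to pass to the momentum representation, and evaluates the integral with a simple trapezoid rule on a finite interval. The grid parameters and function names are hypothetical choices.

```python
import cmath
import math

HBAR = 1.0  # illustrative units

def phi_x(x):
    """Normalized Gaussian wave function in the position representation."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def to_momentum(phi, p, x_max=8.0, n=801):
    """1-D momentum-space function: integral of <p|x> phi(x) dx, trapezoid rule."""
    dx = 2 * x_max / (n - 1)
    total = 0j
    for k in range(n):
        x = -x_max + k * dx
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * cmath.exp(-1j * p * x / HBAR) * phi(x)
    return total * dx / math.sqrt(2 * math.pi * HBAR)

# Momentum-space function on a grid and its normalization integral
dp = 0.1
p_grid = [-8.0 + dp * k for k in range(161)]
phi_p = [to_momentum(phi_x, p) for p in p_grid]
norm_p = sum(abs(v) ** 2 for v in phi_p) * dp

# For this Gaussian (hbar = 1) the transform is again a Gaussian of the same form
value_at_1 = to_momentum(phi_x, 1.0)
exact_at_1 = math.pi ** -0.25 * math.exp(-0.5)
print(norm_p, value_at_1, exact_at_1)
```

The normalization integral stays equal to one in the momentum representation, as it must if both functions describe the same normalized state.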

5.1.2 Parity Operator

In this section I want to make a slight detour and define an important operator closely related to the eigenvectors of the position operator used to introduce the position representation of the state vectors. This operator, called parity operator, is often used to classify wave functions arising in this representation, so it seems quite appropriate to talk about it here.

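The parity operator is defined by its action on the eigenvectors of the position operator $|\mathbf{r}\rangle$ as

$$\hat{\Pi}|\mathbf{r}\rangle = |-\mathbf{r}\rangle, \tag{5.25}$$

the operation often called inversion. It is easy to see that this operator is Hermitian:

$$\langle\mathbf{r}'|\hat{\Pi}|\mathbf{r}\rangle = \langle\mathbf{r}'|-\mathbf{r}\rangle = \delta(\mathbf{r}' + \mathbf{r})$$

$$\left(\langle\mathbf{r}|\hat{\Pi}|\mathbf{r}'\rangle\right)^* = \left(\langle\mathbf{r}|-\mathbf{r}'\rangle\right)^* = \delta(\mathbf{r}' + \mathbf{r})$$

and that it is equal to its inverse, $\hat{\Pi}^2|\mathbf{r}\rangle = \hat{\Pi}|-\mathbf{r}\rangle = |\mathbf{r}\rangle \Rightarrow \hat{\Pi}^2 = \hat{I}$, where $\hat{I}$ is the identity operator. It follows immediately from the last expression that $\hat{\Pi} = \hat{\Pi}^{-1}$. It also means that this operator is unitary. The action of this operator on an arbitrary state can be defined using its position representation:

$$\langle\mathbf{r}|\hat{\Pi}|\psi\rangle = \langle-\mathbf{r}|\psi\rangle = \psi(-\mathbf{r})$$

where $\psi(\mathbf{r}) = \langle\mathbf{r}|\psi\rangle$. It is also important to know how this operator acts on the eigenvectors of the momentum operator, which can be found out using again the coordinate representation:

$$\hat{\Pi}|\mathbf{p}\rangle = \int d\mathbf{r}\,\langle\mathbf{r}|\mathbf{p}\rangle\,\hat{\Pi}|\mathbf{r}\rangle = \int d\mathbf{r}\,\langle\mathbf{r}|\mathbf{p}\rangle\,|-\mathbf{r}\rangle = \int d\mathbf{r}\,\langle-\mathbf{r}|\mathbf{p}\rangle\,|\mathbf{r}\rangle.$$

To derive this result I used the coordinate representation of $|\mathbf{p}\rangle$ and changed the integration variable $\mathbf{r}$ to $-\mathbf{r}$ in the last integral. Finally, using Eq. 5.21 I can write $\langle-\mathbf{r}|\mathbf{p}\rangle = \langle\mathbf{r}|-\mathbf{p}\rangle$, which results in the following transformation rule for $|\mathbf{p}\rangle$:

$$\hat{\Pi}|\mathbf{p}\rangle = |-\mathbf{p}\rangle. \tag{5.26}$$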

The parity operator has only two eigenvalues: $1$ or $-1$. Indeed, assume that $|\lambda\rangle$ is an eigenvector with eigenvalue $\lambda$: $\hat{\Pi}|\lambda\rangle = \lambda|\lambda\rangle$. Apply the parity operator to this relation again: $\hat{\Pi}^2|\lambda\rangle = \lambda\hat{\Pi}|\lambda\rangle \Leftrightarrow |\lambda\rangle = \lambda^2|\lambda\rangle \Rightarrow \lambda = \pm 1$. Accordingly, eigenvectors of $\hat{\Pi}$ represent those states that either do not change upon inversion (we can call them even states) or those that change their sign (odd states). Obviously all even and all odd functions are coordinate representations of the eigenvectors of the parity operator.
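The statement about even and odd functions can be illustrated with a few lines of Python. This is a minimal one-dimensional sketch built on the position-representation rule that parity replaces $\psi(x)$ with $\psi(-x)$; the shifted Gaussian test function, the sample point, and the helper names are assumptions made for illustration. The sketch splits an arbitrary function into its even and odd parts and checks that each part is reproduced by the parity operation with the factor $+1$ or $-1$.

```python
import math

def parity(psi):
    """Position-representation action of the parity operator: psi(x) -> psi(-x)."""
    return lambda x: psi(-x)

def even_part(psi):
    """Projection onto the parity eigenvalue +1."""
    return lambda x: 0.5 * (psi(x) + parity(psi)(x))

def odd_part(psi):
    """Projection onto the parity eigenvalue -1."""
    return lambda x: 0.5 * (psi(x) - parity(psi)(x))

def psi(x):
    return math.exp(-(x - 1.0) ** 2)  # neither even nor odd

psi_even, psi_odd = even_part(psi), odd_part(psi)

x = 0.37  # arbitrary sample point
even_ratio = parity(psi_even)(x) / psi_even(x)  # eigenvalue of the even part
odd_ratio = parity(psi_odd)(x) / psi_odd(x)     # eigenvalue of the odd part
twice = parity(parity(psi))(x)                  # applying parity twice
print(even_ratio, odd_ratio, twice, psi(x))
```

Any function is the sum of its even and odd parts, so this decomposition is the position-representation version of expanding a state in the eigenvectors of the parity operator.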

The parity operator is one of the simplest symmetry operators, which means that it can be used to establish that a Hamiltonian, or another operator corresponding to a quantum observable, does not change when a system is transformed in a certain way. In fancy language, this property is called invariance with respect to a certain transformation. To see why such invariance can be important, consider the time-independent Schrödinger equation:

$$\hat{H}|\alpha\rangle = E|\alpha\rangle$$

and assume that there is an operator (usually a unitary one), which can be used to describe a transformation of the system. Parity operator is one such example: it generates spatial inversion of the system with respect to an origin of the coordinate system. Rotations with respect to an axis or a point provide other examples of transformations described by unitary operators. In what follows I will use notation for the parity operator for the sake of concreteness, but most conclusions in the next paragraph will be applicable to any symmetry operator.

So, let me apply operator $\hat{\Pi}$ to the time-independent Schrödinger equation. In addition, I will also insert the expression $\hat{\Pi}^{-1}\hat{\Pi}$, which is obviously equal to the identity operator, between $\hat{H}$ and $|\alpha\rangle$:

$$\hat{\Pi}\hat{H}\hat{\Pi}^{-1}\hat{\Pi}|\alpha\rangle = E\,\hat{\Pi}|\alpha\rangle.$$

The Schrödinger equation preserves its form when rewritten in terms of the new vector $|\tilde{\alpha}\rangle = \hat{\Pi}|\alpha\rangle$ and the new Hamiltonian $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1}$. This exercise demonstrates that a relation between vectors and operators is preserved if a transformation of a vector is accompanied by the corresponding transformation of the operator. I can now give a formal definition of the invariance of a system, which I earlier loosely described by saying that "the system does not change" upon a certain operation: the system is invariant under a transformation if its Hamiltonian obeys the condition $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$. One of the immediate consequences of this condition is that if $|\alpha\rangle$ is an eigenvector of the Hamiltonian, then $|\tilde{\alpha}\rangle = \hat{\Pi}|\alpha\rangle$ is also an eigenvector. This is an important conclusion, but I cannot dwell on it for too long as it would bring us way outside of our comfort zone. What is more important for us is that the condition $\hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$ implies that the Hamiltonian and the transformation operator commute: $\hat{\Pi}\hat{H} = \hat{H}\hat{\Pi}$. This information can be immediately put to use because we already know what this means: the transformation operator and the Hamiltonian have a common set of eigenvectors. Usually eigenvectors of the former are known, and this knowledge makes finding the eigenvectors of the latter easier. For instance, if I prove that my Hamiltonian is invariant with respect to the parity transformation, I can immediately conclude that all eigenvectors of the Hamiltonian are represented by either even or odd functions, which, as you will see in Sect. 6.2, significantly simplifies their computation.

The Hamiltonian is not the only operator whose behavior under the parity transformation is of interest. Other operators worthy of our consideration are the position and momentum operators. Let me begin with the position operator defined, as you well know, by $\hat{\mathbf{r}}|\mathbf{r}\rangle = \mathbf{r}|\mathbf{r}\rangle$. Performing the same manipulation with this expression as the one to which I just subjected the Hamiltonian, I will have

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}\hat{\Pi}|\mathbf{r}\rangle = \mathbf{r}\,\hat{\Pi}|\mathbf{r}\rangle.$$

Using Eq. 5.25 I transform this into

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}|-\mathbf{r}\rangle = \mathbf{r}\,|-\mathbf{r}\rangle$$

which only makes sense if

$$\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1} = -\hat{\mathbf{r}}.$$

This result demonstrates that the position operator changes its sign upon inversion, which, after some reflection, appears almost obvious. Operators which have this property are called "odd" as opposed to "even" operators, which do not change upon the parity transformation. Obviously, the inversion-invariant operators are by definition "even." I will leave it for you as an exercise to prove that the momentum operator is also "odd."

5.1.3 Schrödinger Equation in the Position Representation

The position representation is the most popular in practical applications of quantum theory. This is the representation in which the original de Broglie matter waves were described and in which Schrödinger wrote his equation. Much of classical physics deals with processes occurring in space and time, so it is not surprising that the wave functions written in the position representation hold a special place in our hearts.¹ It is also important, of course, that the potential energy operator, which might have quite an elaborate position dependence, looks the simplest in the position representation. The momentum operator, on the other hand, does not come in such a multiplicity of forms: it appears mostly in the kinetic energy as the $\hat{p}^2$ term, whose coordinate representation looks quite tolerable.

To derive the coordinate representation of the Hamiltonian, I first need to resolve a few technical questions. In particular, I need to know how to generate a representation of the product of two operators from representations of the individual factors. Consider, for instance, the operator expression $\hat{Q}\hat{S}$, whose integral kernel in some basis $|\xi\rangle$ is $\langle\xi'|\hat{Q}\hat{S}|\xi\rangle$. Inserting a completeness relation (again!) between the operators, I obtain

$$\langle\xi'|\hat{Q}\hat{S}|\xi\rangle = \int d\xi''\,\langle\xi'|\hat{Q}|\xi''\rangle\langle\xi''|\hat{S}|\xi\rangle = \int d\xi''\,Q(\xi',\xi'')\,S(\xi'',\xi). \tag{5.27}$$

An important example is the operator $\hat{P}_x^2$, whose position representation would be useful to know. The integral kernel for $\hat{P}_x$ was found in the previous section as $Q(x',x'') = -i\hbar\,\delta'(x'-x'')$, where the prime on the delta-function signifies differentiation with respect to the first argument. Substitution of this expression into Eq. 5.27 yields

$$\begin{aligned}
\langle x'|\hat{P}_x^2|x\rangle &= -\hbar^2\int dx''\,\frac{d\delta(x'-x'')}{dx'}\,\frac{d\delta(x''-x)}{dx''} = \hbar^2\left.\frac{d^2\delta(x'-x'')}{dx'\,dx''}\right|_{x''=x} \\
&= -\hbar^2\frac{d^2\delta(x'-x)}{dx'^2} = -\hbar^2\frac{d^2\delta(x-x')}{dx^2},
\end{aligned}$$

where in the last line I used the evenness of the delta-function to switch from $\delta(x'-x)$ to $\delta(x-x')$ and the chain rule to change the differentiation variable. If you plug this result into the expression

$$\varphi_\beta(x') = \int \langle x'|\hat{P}_x^2|x\rangle\,\varphi_\alpha(x)\,dx,$$

you will get

$$\varphi_\beta(x') = -\hbar^2\int \frac{d^2\delta(x-x')}{dx^2}\,\varphi_\alpha(x)\,dx = -\hbar^2\frac{d^2\varphi_\alpha(x')}{dx'^2}, \tag{5.28}$$

which means that the coordinate representation of the $\hat{P}_x^2$ operator is just the square of the $-i\hbar\,d/dx$ operator (you could have guessed this, of course, but this derivation was a nice exercise, wasn’t it?). Obviously the same result can be obtained for the two other components of the momentum, which means that the operator $\hat{p}^2$ in the position representation is given by

$$\hat{p}^2 = -\hbar^2\nabla^2, \tag{5.29}$$

where $\nabla^2 = \vec{\nabla}\cdot\vec{\nabla}$ is the Laplacian operator. Using Eq. 5.19, one can easily derive

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{5.30}$$

¹ You might remember that the lack of a spatial-temporal picture was the main complaint Schrödinger leveled against Heisenberg’s “transcendental” algebraic approach.
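The statement that $\hat{P}_x^2$ acts as $-\hbar^2 d^2/dx^2$ is easy to probe numerically. The sketch below is only an illustration under my own assumptions: natural units ($\hbar = 1$), a central second-difference approximation for the derivative, and a plane wave $e^{ikx}$ with an arbitrarily chosen wave number as the test function. For such a function the result should be the function itself multiplied by $(\hbar k)^2$.

```python
import cmath

HBAR = 1.0  # natural units (assumption of this sketch)


def p_squared(psi, x, h=1e-3):
    """Apply -hbar^2 d^2/dx^2 using a central second difference."""
    second = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h**2
    return -HBAR**2 * second


k = 1.5  # hypothetical wave number


def plane_wave(x):
    """Momentum eigenfunction exp(i k x) in the position representation."""
    return cmath.exp(1j * k * x)


x0 = 0.4
ratio = p_squared(plane_wave, x0) / plane_wave(x0)
print(ratio)  # expected to be close to (HBAR * k) ** 2
```

The ratio comes out real (up to rounding) and close to $(\hbar k)^2$, as it should for an eigenfunction of the momentum with eigenvalue $\hbar k$.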

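Since the action of the position operator in the position representation amounts to simple multiplication by the position vector $\mathbf{r}$, the position representation of the potential energy operator $V(\hat{\mathbf{r}})$ amounts to multiplication by $V(\mathbf{r})$. Thus the action of the entire Hamiltonian in the position representation can now be described as

$$\begin{aligned}
\hat{H}_r\Psi_\alpha(\mathbf{r}) &\equiv \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') = \frac{1}{2m_e}\int d\mathbf{r}'\,\langle\mathbf{r}|\hat{p}^2|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') + \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{V}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\,V(\mathbf{r}')\,\langle\mathbf{r}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\,V(\mathbf{r}')\,\delta(\mathbf{r}-\mathbf{r}')\,\Psi_\alpha(\mathbf{r}') \\
&= -\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + V(\mathbf{r})\,\Psi_\alpha(\mathbf{r}),
\end{aligned} \tag{5.31}$$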

where $\Psi_\alpha(\mathbf{r})$ stands for $\langle\mathbf{r}|\alpha\rangle$. Correspondingly, I can write down the Hamiltonian in the position representation simply as

$$\hat{H}_r = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t), \tag{5.32}$$

which acts on functions $\Psi(\mathbf{r},t)$ realizing the position representation of the corresponding quantum states.

The time-dependent Schrödinger equation in the coordinate representation is obtained from Eq. 4.9 by premultiplying it with the basis bra vector $\langle\mathbf{r}|$ and using the completeness relation:

$$i\hbar\frac{d\langle\mathbf{r}|\alpha\rangle}{dt} = \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle\,\langle\mathbf{r}'|\alpha\rangle.$$

The quantity $\langle\mathbf{r}|\alpha\rangle$ on the left-hand side of this equation is simply $\Psi(\mathbf{r},t)$ (I will drop the subindex $\alpha$ from now on), while the right-hand side was evaluated just a few lines above in Eq. 5.31. Thus, I can write the position representation of Eq. 4.9 as

$$i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi(\mathbf{r},t). \tag{5.33}$$

This is what most quantum mechanics textbooks call the celebrated time-dependent Schrödinger equation governing the quantum dynamics of a single-particle quantum state represented by the wave function $\Psi(\mathbf{r},t)$. If the potential function in Eq. 5.33 does not depend on time, one can separate the time and coordinate dependence of the wave function as

$$\Psi(\mathbf{r},t) = \exp\left(-i\frac{E}{\hbar}t\right)\psi(\mathbf{r}), \tag{5.34}$$

where $\psi(\mathbf{r})$ obeys the equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r}), \tag{5.35}$$

often called the time-independent Schrödinger equation. Rewritten in the form $\hat{H}_r\psi(\mathbf{r}) = E\,\psi(\mathbf{r})$, where the subindex $r$ points to the position representation, it becomes reminiscent of Eq. 4.11 defining eigenvalues and eigenvectors of the Hamiltonian. Obviously, Eq. 5.35 produces eigenvectors of the Hamiltonian in the position representation.

This equation, which is a linear differential equation of the second order, has to be complemented by boundary conditions specifying the behavior of the wave functions at infinity. They depend on the type of spectrum (discrete or continuous) the respective wave functions belong to. If the eigenvalue $E$ belongs to a discrete spectrum, we know from the discussion in Sect. 2.2 that the corresponding states are square-integrable, which means that the integral $\int|\psi(\mathbf{r})|^2\,d\mathbf{r}$ taken over the entire volume (it defines the norm of the state vector in the coordinate representation; see Eq. 2.43 or 5.4) is finite. Only functions which tend to zero fast enough when $|\mathbf{r}|\to\infty$ will satisfy this requirement. Thus, the boundary condition for the wave functions of the discrete spectrum can be formulated as

$$\lim_{|\mathbf{r}|\to\infty}|\psi(\mathbf{r})| = 0. \tag{5.36}$$

The existence of a discrete spectrum depends on the behavior of the potential function $V(\mathbf{r})$ and is closely related to the type of classical motion at a given energy. Imagine, for instance, that there exists a closed surface in space separating regions where $E > V(\mathbf{r})$ from the regions where $E < V(\mathbf{r})$. A classical particle can only exist in the former region, because the latter would correspond to negative values of kinetic energy. Regions where the classical kinetic energy would be positive are called classically allowed, while regions where the kinetic energy turns negative are called classically forbidden. The boundary between these two regions, where $E = V(\mathbf{r})$, forms a surface, which a classical particle cannot cross. Such motion of a classical particle is called bound motion. In the quantum mechanical case, Schrödinger Eq. 5.35 has solutions in both regions, which, however, behave in completely different ways. An analysis of the most generic three-dimensional case is mathematically too involved to attempt here, so I shall illustrate this difference by considering a one-dimensional model, with the wave function and the potential depending on a single coordinate, e.g., $x$. For a classically bound motion to take place in this case, there must exist an interval of coordinates $x_1 < x < x_2$, where $E > V(x)$, while everywhere else $E < V(x)$. The terminal points of this interval are the so-called turning points, where a classical particle would momentarily stop before reversing its velocity.

It is convenient to analyze this situation quantum mechanically by rewriting the Schrödinger equation as

$$\frac{d^2\psi(x)}{dx^2} = \frac{2m}{\hbar^2}\left[V(x) - E\right]\psi(x). \tag{5.37}$$

In the classically forbidden regions, which extend to infinity in both positive and negative directions of the coordinate axes, the second derivative of the wave function always has the same sign as the wave function itself. It is easier to discuss the meaning of this result assuming that the wave function and, respectively, its second derivative, in the classically forbidden region, are positive. In this case, if the first derivative is positive (wave function grows), it becomes even more positive so that the wave function bends upward growing even faster with increasing x. If, however, the first derivative is negative (wave function decreases), it is becoming less and less negative approaching zero. The wave function in this case must also asymptotically approach zero without ever changing its sign. This wave function would obviously satisfy the boundary condition given in Eq. 5.36 and, therefore, correspond to the eigenvalue from the discrete spectrum. If the wave function is negative, all the same arguments work, and the wave function is either monotonically decreasing, becoming even more negative, or increasing approaching zero from the negative side. In the classically allowed region, the second derivative is negative, and the solution to the equation does not have to be monotonic. The main conclusion following from these arguments is that the energy eigenvalues belonging to an interval corresponding to a classically bound motion form, in quantum description, a discrete spectrum.
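The argument above is essentially a description of what is often called the shooting method, and you can watch it at work with a few lines of Python. The sketch below is mine, not a prescribed algorithm: it assumes a harmonic potential $V(x) = x^2/2$ in units where $\hbar = m = 1$ (so the exact ground-state energy is $1/2$), integrates Eq. 5.37 from deep inside the left forbidden region with a simple three-point scheme, and uses bisection on the sign of the wave function at the far right end.

```python
def potential(x):
    """Hypothetical test potential: harmonic oscillator with hbar = m = omega = 1."""
    return 0.5 * x * x


def shoot(energy, x_max=5.0, steps=2000):
    """Integrate psi'' = 2 (V - E) psi from -x_max to +x_max; return psi(+x_max)."""
    h = 2.0 * x_max / steps
    psi_prev, psi = 0.0, 1e-10  # psi(-x_max) = 0 and a tiny arbitrary first step
    for n in range(1, steps):
        x = -x_max + n * h
        curvature = 2.0 * (potential(x) - energy) * psi
        psi_prev, psi = psi, 2.0 * psi - psi_prev + h * h * curvature
    return psi


def ground_state_energy(lo=0.3, hi=0.7, tol=1e-8):
    """Bisect on the sign of psi(+x_max), which flips at an eigenvalue."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid > 0.0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


e0 = ground_state_energy()
print(e0)  # expected to be close to 0.5
```

For a trial energy slightly below the eigenvalue the wave function runs away to $+\infty$ in the right forbidden region, for an energy slightly above it runs away to $-\infty$, and only at the eigenvalue does it decay. That is the discreteness of the spectrum in action.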

Wave functions corresponding to the continuous spectrum of energy usually appear in situations when the potential approaches a constant finite value at infinity. If the energy $E$ exceeds this limiting value of the potential, then asymptotically for large values of $x$, Eq. 5.37 takes the following form:

$$\frac{d^2\psi(x)}{dx^2} = -\frac{2m}{\hbar^2}\left[E - V(\infty)\right]\psi(x),$$

which has two possible solutions, $\psi(x)\propto\exp(ikx)$ or $\psi(x)\propto\exp(-ikx)$, where $k = \sqrt{2m\left[E - V(\infty)\right]}/\hbar$. Any one of these asymptotic forms can be chosen as a boundary condition at infinity: the actual choice is determined by the physical problem at hand. This situation often appears in so-called scattering problems, when one is interested in the behavior of a stream of particles incident on the potential from infinity and being registered by a detector on the opposite side of the potential. For this reason, wave functions with asymptotic behavior of this kind are called scattering wave functions. I will talk much more about this situation in subsequent chapters of the book.

Finally, you need to learn about the continuity properties of the wave functions. This issue arises only if the potential V.r/ is not everywhere continuous (if the potential is continuous, the wave functions are automatically continuous). We require that the wave function remains continuous regardless of the discontinuity of the potential. The physical foundation for such a requirement can be given as follows. A discontinuity of wave function means that its first derivative becomes infinite at the point of discontinuity, which creates a whole bunch of problems, e.g., the expectation value of the momentum of the particle at this point becomes infinite.

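However, the continuity of the first derivative of the wave function is not necessarily guaranteed. In the one-dimensional case, one can show that if the discontinuities of the potential only occur in the form of finite “jumps,” the first derivative of the wave function remains continuous (provided that the mass of the particle remains the same on both sides of the “step in the potential”). To see this, one simply needs to integrate Eq. 5.37 over an infinitesimal interval surrounding the point of discontinuity of the potential, $x_d$:

$$\begin{aligned}
&\lim_{\varepsilon\to 0}\int_{x_d-\varepsilon}^{x_d+\varepsilon}\frac{d^2\psi(x)}{dx^2}\,dx = \lim_{\varepsilon\to 0}\left(\left.\frac{d\psi(x)}{dx}\right|_{x_d+\varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d-\varepsilon}\right) = \lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\int_{x_d-\varepsilon}^{x_d+\varepsilon}\left[V(x)-E\right]\psi(x)\,dx \;\Rightarrow \\
&\lim_{\varepsilon\to 0}\left(\left.\frac{d\psi(x)}{dx}\right|_{x_d+\varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d-\varepsilon}\right) = \lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\left[(V_1+V_2)\,\psi(x_d)\,\varepsilon - 2E\,\psi(x_d)\,\varepsilon\right] = 0 \;\Rightarrow \\
&\left.\frac{d\psi(x)}{dx}\right|_{x_d+0} = \left.\frac{d\psi(x)}{dx}\right|_{x_d-0},
\end{aligned} \tag{5.38}$$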

where $V_1 = V(x_d - 0)$ and $V_2 = V(x_d + 0)$. In semiconductor heterostructures (alternating planar layers of different semiconductors), Eq. 5.37 is sometimes used to describe the behavior of charged particles in the so-called effective mass approximation. In this approximation the periodic potential of ions felt by electrons is approximately taken into account by modifying the mass of the electrons from their normal “free electron” value. The new “effective” masses are usually different in different materials, and if the discontinuity of the potential occurs due to an electron passing from one semiconductor to another, its effective mass also changes. Repeating the previous derivation while taking into account the possibility of a discontinuity of the mass, you can derive a generalized derivative continuity condition:

$$\frac{1}{m_1}\left.\frac{d\psi(x)}{dx}\right|_{x_d+0} = \frac{1}{m_2}\left.\frac{d\psi(x)}{dx}\right|_{x_d-0}, \tag{5.39}$$

where $m_{1,2}$ are the values of the effective mass on the two sides of the potential step.

The position representation allows for a useful and conceptually important generalization of the idea of probability conservation expressed by the normalization condition 2.43. Consider the following quantity:

$$P(t) = \int_v|\Psi(\mathbf{r},t)|^2\,d^3r,$$

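which yields the probability that a measurement of the particle’s position will find it within the integration volume $v$. Computing the time derivative of this quantity and utilizing Schrödinger’s equation, Eq. 5.33, you get

$$\begin{aligned}
\frac{\partial P}{\partial t} &= \int_v\left[\Psi(\mathbf{r},t)\frac{\partial\Psi^*(\mathbf{r},t)}{\partial t} + \Psi^*(\mathbf{r},t)\frac{\partial\Psi(\mathbf{r},t)}{\partial t}\right]d^3r \\
&= \frac{1}{i\hbar}\int_v\left\{\Psi^*(\mathbf{r},t)\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi^*(\mathbf{r},t)\right\}d^3r \\
&= \frac{i\hbar}{2m}\int_v\left\{\Psi^*(\mathbf{r},t)\nabla^2\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\nabla^2\Psi^*(\mathbf{r},t)\right\}d^3r.
\end{aligned} \tag{5.40}$$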

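To proceed you will need the following vector identity:

$$\Psi^*(\mathbf{r},t)\nabla^2\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\nabla^2\Psi^*(\mathbf{r},t) \equiv \nabla\cdot\left[\Psi^*(\mathbf{r},t)\nabla\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\nabla\Psi^*(\mathbf{r},t)\right],$$

which is easily proved by working it out from right to left. What is important is that the expression on the right has the form of a divergence of a vector so that Eq. 5.40 can be written as

$$\int_v\frac{\partial}{\partial t}|\Psi(\mathbf{r},t)|^2\,d^3r + \int_v\nabla\cdot\mathbf{j}\,d^3r = 0, \tag{5.41}$$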

where I introduced a vector called the probability current density

$$\mathbf{j} = \frac{i\hbar}{2m_e}\left[\Psi(\mathbf{r},t)\nabla\Psi^*(\mathbf{r},t) - \Psi^*(\mathbf{r},t)\nabla\Psi(\mathbf{r},t)\right]. \tag{5.42}$$

One important property of this quantity is that it vanishes for the wave functions representing a stationary state if its time-independent part is real. Indeed, substituting

$$\Psi(\mathbf{r},t) = \exp(-iEt/\hbar)\,\psi(\mathbf{r}),$$

you can see that the product of the time-dependent factors yields unity, and, if $\psi(\mathbf{r})$ is real, the remaining two terms simply cancel each other. Equation 5.42 can be rewritten in a more illuminating form: introducing a velocity operator

$$\hat{\mathbf{v}} \equiv \frac{\hat{\mathbf{p}}}{m_e} = -\frac{i\hbar}{m_e}\nabla,$$

it can be presented as

$$\mathbf{j} = \frac{1}{2}\left[\Psi^*\hat{\mathbf{v}}\Psi + \Psi\left(\hat{\mathbf{v}}\Psi\right)^*\right]. \tag{5.43}$$

If you do not see an immediate usefulness of bringing out the velocity operator in the definition of $\mathbf{j}$ (besides the purely aesthetic fact that Eq. 5.43 is more pleasant to the eye), let me point out that it highlights the connection between the quantum and classical concepts of the current density. As you may remember from an introductory physics course, the current density for any flowing quantity in classical physics can be written down as $\rho\mathbf{v}$, where $\rho$ is the density of whatever does the flowing (charge, mass, etc.) and $\mathbf{v}$ is the velocity of the flow. This connection becomes even more direct for a freely propagating particle with the wave function

$$\Psi(\mathbf{r},t) = A\exp\left(-iEt/\hbar + i\mathbf{p}\cdot\mathbf{r}/\hbar\right).$$

Substituting this wave function into Eq. 5.42 or 5.43, you will find for the quantum $\mathbf{j}$:

$$\mathbf{j} = |A|^2\,\mathbf{p}/m,$$

which is an exact reproduction of the classical expression if you identify $|A|^2$ with $\rho$.
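Both properties of the probability current mentioned above (it equals $|A|^2 p/m$ for a plane wave and vanishes for a real wave function) can be checked with a short Python sketch. The code restricts Eq. 5.42 to one dimension, assumes natural units $\hbar = m = 1$, and uses a central finite difference for the gradient; the amplitude and wave number are arbitrary values of my own choosing.

```python
import cmath
import math

HBAR = 1.0  # natural units (assumption of this sketch)
MASS = 1.0


def current_density(psi, x, h=1e-5):
    """1D version of Eq. 5.42: j = (i hbar / 2m) (psi dpsi*/dx - psi* dpsi/dx)."""
    dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)
    value = psi(x)
    j = 1j * HBAR / (2.0 * MASS) * (value * dpsi.conjugate() - value.conjugate() * dpsi)
    return j.real  # the imaginary part is zero up to rounding


amplitude = 0.8 + 0.3j  # hypothetical complex amplitude A
k = 2.0                 # hypothetical wave number, so p = HBAR * k


def plane_wave(x):
    return amplitude * cmath.exp(1j * k * x)


def standing_wave(x):
    """A real wave function, for which the current should vanish."""
    return complex(math.cos(k * x))


j_plane = current_density(plane_wave, 0.37)
j_real = current_density(standing_wave, 0.37)
print(j_plane, abs(amplitude) ** 2 * HBAR * k / MASS)
print(j_real)
```

The first line prints two nearly identical numbers, and the second prints a value that is zero to within rounding.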

Using Gauss’ theorem (google it, if you do not remember!), I can rewrite Eq. 5.41 as

$$\int_v\frac{\partial}{\partial t}|\Psi(\mathbf{r},t)|^2\,d^3r = -\oint_\Sigma\mathbf{j}\cdot\mathbf{n}\,dS, \tag{5.44}$$

where $\mathbf{n}$ is a unit vector normal to the surface $\Sigma$ enclosing the volume $v$ (directed outward). The right-hand side of Eq. 5.44 has the meaning of a flux (just like the electric field flux in electromagnetism) characterizing the “flow” of probability across a boundary encompassing the volume. This equation simply states that the probability “to locate” a particle within a given volume decreases if the probability “flows” outside of the volume and increases if the flow of probability is reversed. In this sense, this equation is the statement of conservation of probability, just like a similar statement in electromagnetism would mean conservation of charge, and in hydrodynamics, conservation of mass. An alternative expression of this statement can be obtained if you drop the volume integration in Eq. 5.41 and introduce the probability density $\rho(\mathbf{r},t) \equiv |\Psi(\mathbf{r},t)|^2$:

$$\frac{\partial}{\partial t}\rho(\mathbf{r},t) + \nabla\cdot\mathbf{j} = 0. \tag{5.45}$$

This equation is called the probability continuity equation, and it looks very much like any other continuity equation: in the electrodynamic context, $\rho$ is the charge density, and $\mathbf{j}$ is the current density; in hydrodynamics, $\rho$ is the density of a fluid, and $\mathbf{j}$ is the mass flux; in thermodynamics, $\rho$ is the local energy density, and $\mathbf{j}$ is the energy flux; etc. While in quantum mechanics this equation does not describe the flow of anything material, such as charge or mass, it has very similar empirical significance. Probability current density, for instance, determines such experimentally observable characteristics as scattering cross-sections or reflection and transmission coefficients.
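To see the continuity equation at work in a non-stationary state, you can try the following Python sketch. Everything specific in it is my assumption rather than something taken from the text: a one-dimensional infinite well of unit width, units with $\hbar = m = 1$, and an equal-weight superposition of the two lowest states, for which the density sloshes back and forth in time. The time and space derivatives in Eq. 5.45 are estimated with central differences.

```python
import cmath
import math

HBAR = 1.0   # natural units (assumption of this sketch)
MASS = 1.0
WIDTH = 1.0  # width of the hypothetical infinite well


def level(n):
    """Energy of the n-th infinite-well state."""
    return (n * math.pi * HBAR / WIDTH) ** 2 / (2.0 * MASS)


def psi(x, t):
    """Equal-weight superposition of the two lowest well states."""
    total = 0j
    for n in (1, 2):
        phase = cmath.exp(-1j * level(n) * t / HBAR)
        total += math.sin(n * math.pi * x / WIDTH) * phase
    return total / math.sqrt(WIDTH)


def dpsi_dx(x, t):
    """Analytic spatial derivative of psi."""
    total = 0j
    for n in (1, 2):
        phase = cmath.exp(-1j * level(n) * t / HBAR)
        total += (n * math.pi / WIDTH) * math.cos(n * math.pi * x / WIDTH) * phase
    return total / math.sqrt(WIDTH)


def density(x, t):
    return abs(psi(x, t)) ** 2


def current(x, t):
    """1D Eq. 5.42, which is the same as (hbar/m) Im(psi* dpsi/dx)."""
    return HBAR / MASS * (psi(x, t).conjugate() * dpsi_dx(x, t)).imag


x0, t0, h = 0.3, 0.1, 1e-5
drho_dt = (density(x0, t0 + h) - density(x0, t0 - h)) / (2.0 * h)
dj_dx = (current(x0 + h, t0) - current(x0 - h, t0)) / (2.0 * h)
residual = drho_dt + dj_dx
print(drho_dt, dj_dx, residual)
```

The two derivatives are individually far from zero, but their sum vanishes to within the accuracy of the finite differences, as Eq. 5.45 demands.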

5.1.4 Orbital Angular Momentum in Position Representation

5.1.4.1 Operators

When I first introduced angular momentum operators in Sect. 3.3.4, I emphasized that the importance of the angular momentum is derived from the fact that it commutes with the Hamiltonian of a particle in a central field. At that time I did not have the tools to prove this fact or to study eigenvectors of the angular momentum operators in any particular detail. Using the position representation for these operators, I can eliminate some of those gaps. This representation is generated by substituting Eq. 5.17 for the position representation of the momentum operator into Eqs. 3.50–3.52 with the additional understanding that the action of the position operator is reduced to mere multiplication by $\mathbf{r}$. This procedure generates the following expressions for the Cartesian components of the angular momentum defined with respect to some coordinate axes:

$$\hat{L}_x = -i\hbar\,y\frac{\partial}{\partial z} + i\hbar\,z\frac{\partial}{\partial y} \tag{5.46}$$

$$\hat{L}_y = -i\hbar\,z\frac{\partial}{\partial x} + i\hbar\,x\frac{\partial}{\partial z} \tag{5.47}$$

$$\hat{L}_z = -i\hbar\,x\frac{\partial}{\partial y} + i\hbar\,y\frac{\partial}{\partial x}. \tag{5.48}$$

Fig. 5.1 Spherical coordinate system: a point (r, θ, φ) shown relative to the Cartesian axes x, y, z

These expressions imply that the operators $\hat{L}_{x,y,z}$ act on wave functions defined in terms of the Cartesian coordinates $x$, $y$, and $z$ of a position vector $\mathbf{r}$. However, Cartesian coordinates are not the only way to characterize the position of a point in space. Spherical coordinates, for instance, can do the same job, and in some instances, we might want to have operators acting on functions $\psi(r,\theta,\varphi)$, where $r$, $\theta$, and $\varphi$ are the radial, polar, and azimuthal spherical coordinates (see Fig. 5.1). To make sure that there is no confusion left, let me reiterate: I am using spherical coordinates to describe the position dependence of the wave functions in the coordinate representation, but I keep using the Cartesian coordinate system to introduce components of the vector of the angular momentum and the respective operators. It is important that the two coordinate systems are mutually dependent: the spherical angles $\theta$ and $\varphi$ are defined with respect to the same axes which are used to define the Cartesian components of the angular momentum.

To proceed with my plan, I need to remind you of the well-known relations between Cartesian and spherical coordinates:

$$z = r\cos\theta \tag{5.49}$$

$$x = r\sin\theta\cos\varphi \tag{5.50}$$

$$y = r\sin\theta\sin\varphi \tag{5.51}$$

and

$$r = \sqrt{x^2+y^2+z^2} \tag{5.52}$$

$$\theta = \arccos\left(\frac{z}{\sqrt{x^2+y^2+z^2}}\right) \tag{5.53}$$

$$\varphi = \arctan\left(\frac{y}{x}\right). \tag{5.54}$$
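Relations 5.49–5.54 translate into code almost verbatim. The Python sketch below is just an illustration with function names of my own invention. One liberty I take: instead of a literal $\arctan(y/x)$, it uses the two-argument `math.atan2(y, x)`, which returns the azimuthal angle in the correct quadrant and does not choke on $x = 0$; the point $r = 0$, where the angles are undefined, is not handled.

```python
import math


def to_spherical(x, y, z):
    """Eqs. 5.52-5.54; atan2 is used as a quadrant-aware arctan(y/x)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)   # polar angle, 0 <= theta <= pi
    phi = math.atan2(y, x)     # azimuthal angle, -pi < phi <= pi
    return r, theta, phi


def to_cartesian(r, theta, phi):
    """Eqs. 5.49-5.51."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


point = (-1.0, 2.0, 0.5)  # hypothetical point with negative x
round_trip = to_cartesian(*to_spherical(*point))
print(round_trip)
```

The round trip returns the original point even though $x$ is negative, which a naive `math.atan(y / x)` would not manage without an extra correction by $\pi$.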

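To make the transition from the operators defined in the space of functions $f(x,y,z)$ to the operators acting on functions $f(r,\theta,\varphi)$, I shall use the regular chain rule for differentiation of functions of several variables, which in this case takes the following form:

$$\begin{aligned}
\frac{\partial}{\partial x} &= \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial x}\frac{\partial}{\partial\varphi} \\
\frac{\partial}{\partial y} &= \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial y}\frac{\partial}{\partial\varphi} \\
\frac{\partial}{\partial z} &= \frac{\partial r}{\partial z}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial z}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial z}\frac{\partial}{\partial\varphi}.
\end{aligned}$$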

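I will illustrate this transition by deriving the expression for $\hat{L}_z$ in spherical coordinates. According to Eq. 5.48, I need the derivative operators $\partial/\partial x$ and $\partial/\partial y$. Using Eqs. 5.52–5.54, as well as Eqs. 5.49–5.51, I get

$$\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2+y^2+z^2}} = \sin\theta\cos\varphi.$$

To compute the derivative $\partial\theta/\partial x$, it is more convenient to transform Eq. 5.53 into

$$\cos\theta = \frac{z}{\sqrt{x^2+y^2+z^2}}$$

and differentiate it with respect to $x$:

$$-\sin\theta\,\frac{\partial\theta}{\partial x} = -\frac{zx}{\left(x^2+y^2+z^2\right)^{3/2}}.$$

This expression can now be transformed into

$$\frac{\partial\theta}{\partial x} = \frac{r^2\cos\theta\sin\theta\cos\varphi}{r^3\sin\theta} = \frac{\cos\theta\cos\varphi}{r}.$$

Similarly, starting with $\tan\varphi = y/x$, I find

$$\frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2} = -\frac{\sin\varphi}{r\sin\theta\cos^2\varphi} \;\Rightarrow\; \frac{\partial\varphi}{\partial x} = -\frac{\sin\varphi}{r\sin\theta}.$$

Gathering all these results together, I finally have

$$y\frac{\partial}{\partial x} = r\sin^2\theta\cos\varphi\sin\varphi\,\frac{\partial}{\partial r} + \sin\theta\sin\varphi\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} - \sin^2\varphi\,\frac{\partial}{\partial\varphi}. \tag{5.55}$$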

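Now I need to repeat these calculations for the $x\,\partial/\partial y$ contribution to Eq. 5.48:

$$\frac{\partial r}{\partial y} = \frac{y}{\sqrt{x^2+y^2+z^2}} = \sin\theta\sin\varphi$$

$$-\sin\theta\,\frac{\partial\theta}{\partial y} = -\frac{zy}{\left(x^2+y^2+z^2\right)^{3/2}} \;\Rightarrow\; \frac{\partial\theta}{\partial y} = \frac{\cos\theta\sin\varphi}{r}$$

$$\frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial y} = \frac{1}{x} = \frac{1}{r\sin\theta\cos\varphi} \;\Rightarrow\; \frac{\partial\varphi}{\partial y} = \frac{\cos\varphi}{r\sin\theta}$$

$$x\frac{\partial}{\partial y} = r\sin^2\theta\cos\varphi\sin\varphi\,\frac{\partial}{\partial r} + \sin\theta\sin\varphi\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} + \cos^2\varphi\,\frac{\partial}{\partial\varphi}. \tag{5.56}$$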

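Finally, combining Eqs. 5.55 and 5.56, I am getting my reward for all this hard work because the derived expression for $\hat{L}_z$ is so remarkably simple:

$$\hat{L}_z = -i\hbar\,x\frac{\partial}{\partial y} + i\hbar\,y\frac{\partial}{\partial x} = -i\hbar\frac{\partial}{\partial\varphi}. \tag{5.57}$$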

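This result justifies going to all the trouble involved in transitioning to spherical coordinates. One can also derive similar expressions for the $x$- and $y$-components of the angular momentum, but they are not that pretty:

$$\hat{L}_x = i\hbar\left(\sin\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\,\frac{\partial}{\partial\varphi}\right) \tag{5.58}$$

$$\hat{L}_y = i\hbar\left(-\cos\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\sin\varphi\,\frac{\partial}{\partial\varphi}\right). \tag{5.59}$$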
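If you do not feel like trusting all the algebra that led to Eq. 5.57, you can test it numerically. The Python sketch below applies the Cartesian form of $\hat{L}_z$ (Eq. 5.48) and the spherical form $-i\hbar\,\partial/\partial\varphi$ to the same test function and compares the results. The test function, the evaluation point, the finite-difference step, and the choice $\hbar = 1$ are all my assumptions.

```python
import math

HBAR = 1.0  # natural units (assumption of this sketch)


def f_cart(x, y, z):
    """Hypothetical smooth test function without rotational symmetry."""
    return x * math.exp(-(x * x + 2.0 * y * y + z * z)) + y * z


def f_sph(r, theta, phi):
    """The same function expressed through spherical coordinates."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return f_cart(x, y, z)


def lz_cartesian(x, y, z, h=1e-6):
    """Eq. 5.48: -i hbar (x d/dy - y d/dx), via central differences."""
    df_dy = (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2.0 * h)
    df_dx = (f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2.0 * h)
    return -1j * HBAR * (x * df_dy - y * df_dx)


def lz_spherical(r, theta, phi, h=1e-6):
    """Eq. 5.57: -i hbar d/dphi, via a central difference."""
    df_dphi = (f_sph(r, theta, phi + h) - f_sph(r, theta, phi - h)) / (2.0 * h)
    return -1j * HBAR * df_dphi


r, theta, phi = 1.2, 0.9, 0.5
x = r * math.sin(theta) * math.cos(phi)
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)

lz_a = lz_cartesian(x, y, z)
lz_b = lz_spherical(r, theta, phi)
print(lz_a, lz_b)
```

The two printed numbers coincide to within the finite-difference error, even though one of them was computed from two Cartesian derivatives and the other from a single derivative with respect to the azimuthal angle.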

The remarkable simplicity of $\hat{L}_z$ expressed in terms of the derivative with respect to a spherical coordinate is the main reason why it became customary to consider the pair $\hat{L}_z$, $\hat{L}^2$ as a set of commuting operators when dealing with the angular momentum. Derivation of the expression for the operator $\hat{L}^2$ in terms of spherical coordinates is quite straightforward, and while it is excruciatingly tedious, it does lead to a really awesome answer:

$$\hat{L}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]. \tag{5.60}$$

Fig. 5.2 Breaking down a classical momentum in components

However, in order to appreciate its awesomeness, you might have to google “Laplacian operator” unless, of course, you are also awesome and remember what it looks like in spherical coordinates (in Cartesian coordinates it was defined in Eq. 5.30). For your convenience I will present it here:

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \tag{5.61}$$

hoping that you notice that the angular part of the Laplacian (the expression in square brackets) is identical to $-\hat{L}^2/\hbar^2$. And this fact is not left without important consequences. Recall that the Laplacian operator defines the coordinate representation of the kinetic energy operator $\hat{K} = -\hbar^2\nabla^2/2m_e$, which now can be written down in spherical coordinates as

$$\hat{K} = -\frac{\hbar^2}{2m_er^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2m_er^2}. \tag{5.62}$$

This presentation of the kinetic energy makes it plainly obvious that $[\hat{K},\hat{L}^2] = 0$. Indeed, the radial part of the kinetic energy commutes with $\hat{L}^2$ because they contain derivatives with respect to different coordinates, and the angular part is simply proportional to $\hat{L}^2$, which obviously commutes with itself. To get an even better appreciation of Eq. 5.62, it is interesting to consider the classical kinetic energy rewritten in terms of two mutually perpendicular components of the momentum: $p_\perp$, which is normal to the particle’s position vector, and $p_r$, which is aligned with it. Taking into account that the momentum is tangential to the particle’s trajectory, you can see (Fig. 5.2) that $p_\perp = p\sin\vartheta$, where $\vartheta$ is the angle between the momentum vector and the position vector at a given point. In terms of these two components, the kinetic energy can be presented as

$$K = \frac{p_\perp^2}{2m_e} + \frac{p_r^2}{2m_e} = \frac{p^2\sin^2\vartheta}{2m_e} + \frac{p_r^2}{2m_e}.$$

Now, let me play a bit with the first of these terms, multiplying its numerator and denominator by $r^2$:

$$\frac{p^2\sin^2\vartheta}{2m_e} = \frac{p^2r^2\sin^2\vartheta}{2m_er^2}.$$

I am sure you recognize now that the numerator of this expression is $|\mathbf{r}\times\mathbf{p}|^2$, which is nothing but the square of the classical angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$. Thus, the classical kinetic energy can be presented as

$$K = \frac{p_r^2}{2m_e} + \frac{L^2}{2m_er^2},$$

where the last term “miraculously” reproduces a similar term in the quantum mechanical Eq. 5.62. Isn’t it true that physics (and math) work in mysterious ways?
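The classical decomposition of the kinetic energy into radial and angular parts is an exact vector identity, which a few lines of Python can confirm for any position and momentum you like. The vectors and the mass below are arbitrary numbers of my own choosing, and the helper functions are written out by hand to keep the sketch self-contained.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


mass = 2.0              # hypothetical particle mass
r = (1.0, -2.0, 0.5)    # hypothetical position vector
p = (0.3, 0.4, -1.2)    # hypothetical momentum vector

r_norm = math.sqrt(dot(r, r))
p_radial = dot(p, r) / r_norm      # component of p along r
ang_mom = cross(r, p)              # classical L = r x p

k_total = dot(p, p) / (2.0 * mass)
k_split = p_radial ** 2 / (2.0 * mass) + dot(ang_mom, ang_mom) / (2.0 * mass * r_norm ** 2)
print(k_total, k_split)
```

Both numbers are the same to machine precision, reflecting the identity $|\mathbf{r}\times\mathbf{p}|^2 = r^2p^2 - (\mathbf{r}\cdot\mathbf{p})^2$.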

Now I can fulfill my promise made in Sect. 3.3.4 and prove that the operators of the angular momentum commute with the Hamiltonian if the particle’s potential energy belongs to the class of central potentials. Actually, Eq. 5.60 makes the proof quite trivial: the angular momentum operators in the position representation contain only derivatives with respect to angular variables, so that if the potential energy depends only on the radial coordinate, $V = V(r)$ (the definition of a central potential!), then neither $\hat{L}_z$ nor $\hat{L}^2$ affects $V(r)$, so that $\hat{L}^2V(r)\psi(r,\theta,\varphi) = V(r)\hat{L}^2\psi(r,\theta,\varphi)$, and the same is obviously true for the $\hat{L}_z$ operator. Since I already showed that the angular momentum commutes with the kinetic energy, the last remark completes the required proof.

The direct consequence of the vanishing commutators $[\hat{H},\hat{L}_z]$ and $[\hat{H},\hat{L}^2]$ is that the common eigenvectors of $\hat{L}^2$ and $\hat{L}_z$ are also eigenvectors of Hamiltonians with a central potential, which makes the task of finding these eigenvectors especially important. And this is what, without further ado, I am going to do now.

5.1.4.2 Eigenvectors

First of all, let me remind you that we are looking for the functions which represent common eigenvectors of the operators $\hat{L}^2$ and $\hat{L}_z$. This means that these functions must simultaneously obey both equations:

$$\hat{L}_z\psi_{lm}(\theta,\varphi) = \hbar m\,\psi_{lm}(\theta,\varphi) \tag{5.63}$$

and

$$\hat{L}^2\psi_{lm}(\theta,\varphi) = \hbar^2l(l+1)\,\psi_{lm}(\theta,\varphi). \tag{5.64}$$

I begin with the operator $\hat{L}_z$, whose eigenvectors in the coordinate representation are particularly easy to find. First, let me notice that this operator only contains derivatives with respect to $\varphi$, so that the angular variable $\theta$ plays here the role of a “silent” parameter, a constant, as far as the operator $\hat{L}_z$ is concerned. In formal language it means that the dependence on $\theta$ may appear in the function $\psi_{lm}(\theta,\varphi)$ only as a factor in front of the “main” function dependent only on $\varphi$:

$$\psi_{lm}(\theta,\varphi) = P_l^m(\theta)\,\Phi_m(\varphi). \tag{5.65}$$

Substituting this form into Eq. 5.63, you can see that $P_l^m(\theta)$ indeed behaves as a constant and can be canceled. The resulting equation for the remaining function,

$$-i\hbar\frac{\partial\Phi_m(\varphi)}{\partial\varphi} = \hbar m\,\Phi_m(\varphi),$$

has an obvious solution

$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\exp(im\varphi). \tag{5.66}$$

Now consider how the function $\Phi_m(\varphi)$ evolves when the position vector rotates around the $Z$ axis. After one complete rotation, which corresponds to a change of $\varphi$ by $2\pi$, the position vector returns to its initial position. It would have been weird if the wave function did not return to its initial value as well. In somewhat more sophisticated language, it means that the function $\Phi_m(\varphi)$ is expected to be periodic in $\varphi$. This can only be achieved if you allow only integer values of $m$: $m = 0, \pm1, \pm2, \ldots$ This is only half of the eigenvalues of the operator $\hat{L}_z$ found by algebraic methods in Sect. 3.3.4. The eigenvalues corresponding to half-integer values of $m$ result in solutions that change their sign upon rotation by $2\pi$ and shall be discarded. It does not mean, of course, that half-integer values of $m$ have no place in quantum theory; it only means that they cannot correspond to eigenvectors permitted in the position representation. The factor $1/\sqrt{2\pi}$ in Eq. 5.66 ensures that the wave function $\Phi_m(\varphi)$ is normalized with respect to the inner product defined as

$$\langle\Phi_{m_1}|\Phi_{m_2}\rangle \equiv \int_0^{2\pi}\Phi^*_{m_1}(\varphi)\,\Phi_{m_2}(\varphi)\,d\varphi = \frac{1}{2\pi}\int_0^{2\pi}\exp\left[i(m_2-m_1)\varphi\right]d\varphi. \tag{5.67}$$
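Both the periodicity argument and the inner product of Eq. 5.67 are easy to play with numerically. The following Python sketch evaluates the integral with a plain midpoint rule (my choice, not anything prescribed by the text) and compares the values of $\Phi_m$ at $\varphi = 0$ and $\varphi = 2\pi$ for an integer and a half-integer $m$.

```python
import cmath
import math


def phi_m(m, angle):
    """Eq. 5.66: exp(i m phi) / sqrt(2 pi)."""
    return cmath.exp(1j * m * angle) / math.sqrt(2.0 * math.pi)


def inner(m1, m2, n=2000):
    """Midpoint-rule estimate of the inner product in Eq. 5.67."""
    step = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        angle = (k + 0.5) * step
        total += phi_m(m1, angle).conjugate() * phi_m(m2, angle)
    return total * step


norm = inner(2, 2)       # same m: expected to be close to 1
overlap = inner(1, 3)    # different integer m: expected to be close to 0

full_turn = 2.0 * math.pi
integer_jump = phi_m(3, full_turn) - phi_m(3, 0.0)     # vanishes: periodic
half_jump = phi_m(0.5, full_turn) + phi_m(0.5, 0.0)    # vanishes: sign flip
print(abs(norm), abs(overlap), abs(integer_jump), abs(half_jump))
```

The last two numbers show that for integer $m$ the function returns to itself after a full turn, while for $m = 1/2$ it returns with the opposite sign.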

It is obvious that with this definition of the inner product, the functions representing the eigenvectors are not only normalized but also orthogonal. The integral in Eq. 5.67 is a part of a surface integral carried out over the surface of a sphere, which in spherical coordinates has the following form:

$$\langle\psi_1|\psi_2\rangle = \int_0^{\pi}\int_0^{2\pi}d\theta\,d\varphi\,\sin\theta\,\psi_1^*(\theta,\varphi)\,\psi_2(\theta,\varphi), \tag{5.68}$$

where $d\theta\,d\varphi\,\sin\theta$ is a spherical area element. The remaining integration over the polar angle $\theta$ defines the inner product for the yet unknown functions $P_l^m(\theta)$:

$$\langle P_{l_1}^{m_1}|P_{l_2}^{m_2}\rangle = \int_0^{\pi}d\theta\,\sin\theta\,\left[P_{l_1}^{m_1}(\theta)\right]^*P_{l_2}^{m_2}(\theta). \tag{5.69}$$

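These functions are found by substituting $\psi_{lm}(\theta,\varphi) = P_l^m(\theta)\exp(im\varphi)$ into Eq. 5.64, which results in the following equation:

$$-\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]P_l^m(\theta)\exp(im\varphi) = l(l+1)\,P_l^m(\theta)\exp(im\varphi).$$

Carrying out the differentiation with respect to $\varphi$ and canceling the exponential factor results in the following equation for $P_l^m(\theta)$:

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial P_l^m}{\partial\theta}\right) - \frac{m^2}{\sin^2\theta}P_l^m + l(l+1)P_l^m = 0.$$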

Do you see now why I kept both indexes $l$ and $m$ in the notation for $P_l^m$? By introducing the new variable $x = \cos\theta$, this equation can be rewritten as

$$\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_l^m}{dx}\right] + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0, \tag{5.70}$$

where I used the relation $d/d\theta = (dx/d\theta)\,d/dx = -\sin\theta\,d/dx$ and replaced $\sin^2\theta$ with $1-\cos^2\theta = 1-x^2$.

This equation is very well known in mathematical physics as the general Legendre equation, whose solutions can be presented in the form of associated Legendre functions, $P_l^m(x) \equiv P_l^m(\cos\theta)$. As is clear from the relation between the variables $x$ and $\theta$, the functions $P_l^m(x)$ are defined on the interval $x\in[-1,1]$, where they are orthogonal with the inner product defined as $\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx$:

$$\int_{-1}^{1}P_{l_1}^m(x)\,P_l^m(x)\,dx = \frac{2\,(l+m)!}{(2l+1)\,(l-m)!}\,\delta_{l,l_1}. \tag{5.71}$$

You may want to notice that the substitution of the integration variable $x = \cos\theta$ converts this integral into a form identical to the integral in Eq. 5.69.

The proof of orthogonality of the Legendre functions is fairly standard for differential equations of this kind, and you will benefit from learning how to carry it out. First, copy Eq. 5.70 for $P_{l_1}^m$:

$$\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_{l_1}^m}{dx}\right] + \left[l_1(l_1+1) - \frac{m^2}{1-x^2}\right]P_{l_1}^m = 0. \tag{5.72}$$

Now, multiply Eq. 5.70 by $P_{l_1}^m$ and Eq. 5.72 by $P_l^m$, and integrate the resulting expressions from $-1$ to $1$:

$$\begin{aligned}
\int_{-1}^{1}P_{l_1}^m\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_l^m}{dx}\right]dx + l(l+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx &= 0 \\
\int_{-1}^{1}P_l^m\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_{l_1}^m}{dx}\right]dx + l_1(l_1+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx &= 0.
\end{aligned}$$

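Integration of the first terms in both equations by parts (the boundary terms vanish because the factor $1-x^2$ turns to zero at $x=\pm1$) yields

$$\begin{aligned}
-\int_{-1}^{1}\left(1-x^2\right)\frac{dP_l^m}{dx}\frac{dP_{l_1}^m}{dx}\,dx + l(l+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx &= 0 \\
-\int_{-1}^{1}\left(1-x^2\right)\frac{dP_l^m}{dx}\frac{dP_{l_1}^m}{dx}\,dx + l_1(l_1+1)\int_{-1}^{1}P_{l_1}^m(x)P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)P_l^m(x)}{1-x^2}\,dx &= 0,
\end{aligned}$$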

and by subtracting these two expressions, you get

$$\left[l(l+1) - l_1(l_1+1)\right]\int_{-1}^{1}P_{l_1}^m(x)\,P_l^m(x)\,dx = 0.$$

It is quite obvious now that for $l\neq l_1$ this equality can only hold if

$$\int_{-1}^{1}P_{l_1}^m(x)\,P_l^m(x)\,dx = 0.$$

The derivation of the normalization coefficient in Eq. 5.71 requires a bit more effort, and I shall leave it for the most curious readers to discover it for themselves (google it!). You can also notice that in the case of functions with equal $l$ and different $m$, the same line of reasoning results in a different orthogonality condition:

$$\int_{-1}^{1}\frac{P_l^{m_1}(x)\,P_l^m(x)}{1-x^2}\,dx = \begin{cases} 0, & m\neq m_1 \\[4pt] \dfrac{(l+m)!}{m\,(l-m)!}, & m = m_1\neq 0 \\[4pt] \infty, & m = m_1 = 0 \end{cases} \tag{5.73}$$

where, again, derivation of the normalization integral lies outside the scope of this text.

The associated Legendre polynomials can be computed using the following expression:

$$P_l^m(x) = (-1)^m\,(1-x^2)^{m/2}\,\frac{d^{\,l+m}}{dx^{\,l+m}}\,(x^2-1)^l \tag{5.74}$$

where the factor $(-1)^m$ is known as the Condon-Shortley phase and is sometimes excluded from the definition of $P_l^m(x)$. Equation 5.74 makes sense and gives non-zero results if and only if $l$ and $m$ are integers and $0 \le l+m \le 2l$, that is, $-l \le m \le l$. The integer part of this statement is obvious: derivatives of fractional order are not something that we can live with at this point. The second part of this statement, which reiterates what we have already learned about the relation between these two quantum numbers in Sect. 3.3.4, can be understood by noticing that the function $(x^2-1)^l$ is a polynomial of order $2l$ and, therefore, can be differentiated no more than $2l$ times before it starts producing zeroes.

Legendre equation 5.70 is invariant (does not change) if you replace $m$ with $-m$.

This means that the solutions of this equation characterized by $m$ and $-m$ must be proportional to each other. Indeed, one can show that the functions defined by Eq. 5.74 satisfy the following important relation:

$$P_l^{-m}(x) = (-1)^m\,\frac{(l-m)!}{(l+m)!}\,P_l^m(x). \tag{5.75}$$
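Equations 5.71 and 5.74 can be cross-checked with a few lines of exact rational arithmetic. The sketch below is only an illustration, not part of the derivation above. It assumes the conventional prefactor $1/(2^l\,l!)$ in front of the derivative in Eq. 5.74 (the normalization for which the coefficient in Eq. 5.71 is usually quoted) and non-negative $m$. Under these assumptions the product of two associated Legendre functions with the same $m$ is a polynomial, $(1-x^2)^m$ times two derivatives, so the integral can be evaluated exactly. All function names are made up for this example.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def poly_diff(p):
    """Differentiate a polynomial once; the zero polynomial is [0]."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def legendre_core(l, m):
    """d^(l+m)/dx^(l+m) (x^2 - 1)^l, divided by 2^l l! (assumed normalization)."""
    p = poly_pow([Fraction(-1), Fraction(0), Fraction(1)], l)
    for _ in range(l + m):
        p = poly_diff(p)
    return [c / (2**l * factorial(l)) for c in p]

def overlap(l1, l, m):
    """Exact integral of P^m_{l1}(x) P^m_l(x) over [-1, 1] for m >= 0."""
    weight = poly_pow([Fraction(1), Fraction(0), Fraction(-1)], m)  # (1 - x^2)^m
    integrand = poly_mul(weight, poly_mul(legendre_core(l1, m), legendre_core(l, m)))
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum(c * (1 - Fraction(-1) ** (k + 1)) / (k + 1)
               for k, c in enumerate(integrand))

def norm_5_71(l, m):
    """Right-hand side of Eq. 5.71 for l = l1."""
    return Fraction(2 * factorial(l + m), (2 * l + 1) * factorial(l - m))
```

Calling `overlap(1, 3, 1)` probes orthogonality for a pair of functions of the same parity, where the integral does not vanish by symmetry alone, and `overlap(3, 3, 1)` can be compared with `norm_5_71(3, 1)`. The Condon-Shortley phase drops out because it enters the product squared.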

Finally, combining Eqs. 5.66 and 5.74 and adding the corresponding normalization coefficients, we end up with a set of functions $\psi_{lm} \equiv Y_l^m(\theta,\varphi)$ known as spherical harmonics and defined as

$$Y_l^m(\theta,\varphi) = (-1)^m\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;P_l^m(\cos\theta)\,e^{im\varphi}. \tag{5.76}$$

In light of the results presented above, the spherical harmonics are obviously orthogonal and normalized:

$$\int_0^{\pi}\!\!\int_0^{2\pi}\left[Y_l^m(\theta,\varphi)\right]^* Y_{l_1}^{m_1}(\theta,\varphi)\,\sin\theta\,d\theta\,d\varphi = \delta_{ll_1}\delta_{mm_1} \tag{5.77}$$

providing us with the position representation of the normalized common eigenvectors of the operators $\hat{L}^2$ and $\hat{L}_z$.

I will conclude this section with a brief description of main qualitative properties of the spherical harmonics. Numerous identities and recursion relations involving associated Legendre functions are well documented and are easily available in the literature and on the Internet. However, it is important to have a qualitative understanding of how spherical harmonics behave off the top of one’s head.

The first thing to notice is the symmetry of the spherical harmonics upon inversion of the position vector on the sphere with respect to the origin of the coordinate system: $\mathbf{r} \to -\mathbf{r}$. This corresponds to the transformation of the angular spherical coordinates $\theta \to \pi - \theta$, $\varphi \to \varphi + \pi$. Upon this transformation $\exp(im\varphi) \to (-1)^m\exp(im\varphi)$, while the argument $x = \cos\theta$ of the associated Legendre function $P_l^m(\cos\theta)$ transforms as $\cos\theta \to \cos(\pi-\theta) = -\cos\theta$, i.e., we are dealing here with the inversion $x \to -x$. Associated Legendre functions have a definite parity: they are either even (do not change) or odd (change sign) when their argument changes sign. This is quite obvious from Eq. 5.74: replacing $x \to -x$ does not change the function being differentiated or the factor preceding the differentiation, while each of the $l+m$ derivatives with respect to $x$ changes its sign. It is obvious, therefore, that

$$P_l^m(-x) = (-1)^{l+m}\,P_l^m(x). \tag{5.78}$$

Combining this result with the transformation property of $\exp(im\varphi)$, we have for the spherical harmonics

$$Y_l^m(\pi-\theta,\varphi+\pi) = (-1)^l\,Y_l^m(\theta,\varphi) \tag{5.79}$$

which means that the spherical harmonics have definite parity: they are either even with respect to inversion (for even values of $l$) or odd, if $l$ is an odd number. This behavior is consistent with the fact that the operator of the orbital angular momentum $\hat{\mathbf{L}} = \hat{\mathbf{r}} \times \hat{\mathbf{p}}$ is invariant with respect to the parity transformation, and, therefore, its eigenvectors must also be eigenvectors of the parity operator, i.e., have a definite parity.

It is also important to have a picture of the dependence of the spherical harmonics upon their arguments. The dependence on the azimuthal angle $\varphi$ is trivial: the real and imaginary parts of the spherical harmonics oscillate with frequency $m$, but these oscillations are not really significant, unless we are dealing with a superposition state comprised of several spherical harmonics with different azimuthal numbers. For a single spherical harmonic, the relevant properties are often described by its absolute value squared, $\left|Y_l^m(\theta,\varphi)\right|^2$, which loses all dependence on $\varphi$. The dependence on the polar angle contained in $P_l^m(\cos\theta)$ is a more interesting matter and is determined by the values of both quantum numbers $l$ and $m$ separately, as well as by their difference $l - m$. For instance, for $m = l$, it is easy to see that

$$P_l^l(\cos\theta) \propto \sin^l\theta,$$

which takes zero values at $\theta = 0, \pi$ (the two poles of the sphere) and has a single maximum at the equator $\theta = \pi/2$. The width of the maximum (loosely defined) becomes smaller with increasing $l$ (the function decreases more rapidly away from the equator for larger $l$). If one likes pseudoclassical mind helpers (I would not even call them "analogies"), one can think about a particle rotating around the equator with its angular momentum pointing in the polar direction. The larger the angular momentum is, the more torque would be required to turn it away from the poles, which can be kind of loosely interpreted as a smaller probability for a particle to deviate from the equatorial trajectory. But, please, do not take this pseudoclassical mumbo jumbo too seriously.

The case of $m = 0$ corresponds to the classical angular momentum lying in the equatorial plane, while the respective spherical harmonics are reduced to regular Legendre polynomials:

$$P_l(x) = \frac{d^{\,l}}{dx^{\,l}}\,(x^2-1)^l.$$

These are the only spherical harmonics which do not have zeroes at the poles of the sphere ($x = \pm 1$, or $\theta = 0, \pi$), but they have zeroes between the poles, whose number is equal to the orbital number $l$. Obviously, the number of minima and maxima of these functions is always equal to $l - 1$ (the only exception is $l = 0$, when we are dealing with a constant). In the case of generic values of $m \neq 0$, the spherical harmonics vanish at the poles, and the number of their nodes in the polar direction is equal to $l - m$. In my opinion, mastering the provided information will help you not only to have a qualitative feeling for various expressions and phenomena involving spherical harmonics but also to make quite an impression at a cocktail party. To help you with visualizing these properties, I plotted graphs of the associated Legendre polynomials with $l = 3$ in Fig. 5.3. To make the picture prettier, I normalized all functions in the plot to bring their maximum values closer to each other; obviously this procedure did not change their qualitative behavior.

5.2 Representations in Discrete Basis

Fig. 5.3 Graphs of associated Legendre polynomials with $l = 3$ and $0 \le m \le 3$

Now let's talk about the representation of abstract vectors in discrete bases. Equation 3.39, which represents a vector $|\alpha\rangle$ in a basis $|\chi_n\rangle$, establishes a one-to-one correspondence between the vector and a set of coefficients $a_n$. These coefficients are a discrete analog of the functions representing vectors in the continuous bases introduced in the previous section and can be arranged in the form of a column vector. Thus, in this case we are representing the abstract vector space by a space of column vectors with all the rules of matrix addition and multiplication defined for these objects. The Hermitian conjugation in this space was discussed in Sect. 2.2.2 and includes transitioning to the adjoint space inhabited by row vectors with complex-conjugated elements:

$$\langle\alpha| = \sum_n a_n^*\,\langle\chi_n|.$$

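The inner product now becomes a standard matrix multiplication between a row vector on the left and a column vector on the right:

$$\langle\alpha|\beta\rangle = \begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* & \cdots \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \\ \vdots \end{bmatrix} = \sum_{i=1}^{\infty} a_i^*\,b_i, \tag{5.80}$$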

where $b_n$ are the coefficients in the expansion of the ket $|\beta\rangle$ in the same basis, while the outer or tensor product $|\alpha\rangle\langle\beta|$ is represented by a matrix formed according to the rules of the matrix tensor product, Eq. 2.16:

$$S^{(\alpha,\beta)}_{nm} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \\ \vdots \end{bmatrix}\begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* & \cdots \end{bmatrix} = \begin{bmatrix} a_1b_1^* & a_1b_2^* & \cdots & a_1b_N^* & \cdots \\ a_2b_1^* & a_2b_2^* & \cdots & a_2b_N^* & \cdots \\ \vdots & \vdots & \ddots & \vdots & \\ a_Nb_1^* & a_Nb_2^* & \cdots & a_Nb_N^* & \cdots \\ \vdots & \vdots & & \vdots & \ddots \end{bmatrix}. \tag{5.81}$$


Due to the normalization requirement accepted for the state vectors, the expansion coefficients obey the following obvious "sum" rule:

$$\sum_n |a_n|^2 = 1 \tag{5.82}$$

which is, again, a discrete analog of Eq. 5.4.

If the space of states of a given quantum system can be fully described by a discrete basis, these states can always be presented in the form of column and row vectors, reducing the problem to that of matrix algebra (remember Heisenberg's matrix mechanics: this is where it finds its roots). The main difference, of course, is that in standard linear algebra problems the dimension of the space is always finite, while spaces of quantum mechanical states are normally infinite-dimensional. This creates a number of technical problems of a mathematical nature, but we shall let mathematicians worry about them. At any rate, in most practical applications of quantum theory, you will not have to deal with the entire infinite-dimensional space of states. Usually, it is possible to find a way to restrict attention to a much smaller (sometimes just two-dimensional) subspace using certain physically meaningful assumptions about the hierarchy of interactions relevant for the problem under study.

To get you started, consider this simplest of examples, which, however, often gives students a headache.

Example 16 (A Basis Vector in Its Own Basis) This example deals with the following question: what is the representation of a vector in a basis to which this vector itself belongs? In other words, if $|\chi_{\tilde{n}}\rangle$ is one of the set of orthogonal normalized vectors $|\chi_n\rangle$, $n = 1, 2, \dots$, which column vector will represent it in the basis formed by these vectors? Even though the answer to this question is almost trivial, it never fails to confuse students. Assume, for instance, that $\tilde{n} = 1$. In this case I have $|\chi_1\rangle = 1\,|\chi_1\rangle + 0\,|\chi_2\rangle + 0\,|\chi_3\rangle + \cdots$. Obviously, the corresponding column vector is

$$\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{bmatrix}.$$

Considering $\tilde{n} = 2$, I will similarly find that the column representing this vector contains unity in the second position and zeroes everywhere else. This pattern, of course, repeats itself for all other elements of the basis: any basis vector $|\chi_{\tilde{n}}\rangle$ is represented, in the basis of which it is an element, by a column in which all components but one are zeroes, and the only non-zero component, in the $\tilde{n}$-th place, is unity.

Now, if column vectors can represent state vectors, it is almost obvious that operators must be represented by matrices, in which case the word "act" would mean matrix multiplication. Matrices multiply column vectors from the left and row vectors from the right. The main question is how to construct a matrix representing a given operator in a chosen basis. To answer this question, I will again rely on the completeness relation (its discrete basis reincarnation, Eq. 3.42) for the basis vectors $|\chi_n\rangle$. Insertion of this relation into $|\beta\rangle = \hat{T}|\alpha\rangle$ yields

$$\sum_{n=0}^{\infty}|\chi_n\rangle\langle\chi_n|\beta\rangle = \sum_{n=0}^{\infty}|\chi_n\rangle\langle\chi_n|\,\hat{T}\sum_{m=0}^{\infty}|\chi_m\rangle\langle\chi_m|\alpha\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\,\langle\chi_m|\alpha\rangle\,|\chi_n\rangle.$$

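Taking into account that the coefficients $b_n$ are given by $b_n = \langle\chi_n|\beta\rangle$ and the coefficients $a_m$ by $a_m = \langle\chi_m|\alpha\rangle$, I transform the previous equation into

$$\sum_{n=0}^{\infty} b_n\,|\chi_n\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\,a_m\,|\chi_n\rangle.$$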

Now, thanks to the linear independence of the basis vectors, I can simply equate the coefficients in front of each of $|\chi_n\rangle$ separately:

$$b_n = \sum_{m=0}^{\infty}\langle\chi_n|\hat{T}|\chi_m\rangle\,a_m. \tag{5.83}$$

This expression can be rewritten in the matrix form as

$$\mathbf{b} = \mathbf{T}\cdot\mathbf{a}$$

where I am using bold Latin letters to denote columns and matrices representing vectors and operators in a given discrete basis. This means that the required matrix representation of the given operator is

$$T_{nm} = \langle\chi_n|\hat{T}|\chi_m\rangle. \tag{5.84}$$

To illustrate an application of this result, consider the matrix of the operator $\hat{S} = |\alpha\rangle\langle\beta|$ obtained as the outer product of two vectors. Using Eq. 5.84, you immediately find

$$S_{nm} = \langle\chi_n|\alpha\rangle\langle\beta|\chi_m\rangle = a_n b_m^*$$

in full agreement with the result obtained using the standard matrix definition of the outer product, Eq. 5.81.
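The column and row bookkeeping of Eqs. 5.80, 5.81, and 5.83 is easy to try out on a computer. The following is a minimal sketch in a two-dimensional space, using plain Python lists and complex numbers; the function names and the particular vectors are chosen only for illustration.

```python
def inner(a, b):
    """<alpha|beta> = sum_i conj(a_i) * b_i, as in Eq. 5.80."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def outer(a, b):
    """Matrix of |alpha><beta|: S[n][m] = a_n * conj(b_m), as in Eq. 5.81."""
    return [[x * y.conjugate() for y in b] for x in a]

def apply(matrix, vec):
    """b_n = sum_m T_nm a_m, as in Eq. 5.83."""
    return [sum(row[m] * vec[m] for m in range(len(vec))) for row in matrix]

# Illustrative vectors (assumed values, not taken from the text).
a = [1 / 2**0.5, 1j / 2**0.5]   # normalized: |a_1|^2 + |a_2|^2 = 1
b = [0.6, 0.8j]
g = [1 + 0j, 2 - 1j]

S = outer(a, b)
lhs = apply(S, g)                      # matrix S acting on the column g
rhs = [x * inner(b, g) for x in a]     # |alpha> multiplied by the number <beta|gamma>
```

The two lists `lhs` and `rhs` should agree up to rounding, which is just the statement that $|\alpha\rangle\langle\beta|$ acting on $|\gamma\rangle$ gives $|\alpha\rangle$ times the number $\langle\beta|\gamma\rangle$.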

The equation for eigenvectors, $\hat{T}|\alpha\rangle = \lambda|\alpha\rangle$, in the matrix representation is reduced to the matrix equation

$$\sum_{m=0}^{\infty} T_{nm}\,a_m = \lambda\,a_n$$

which can be rewritten as

$$\sum_{m=0}^{\infty}\left(T_{nm} - \lambda\,\delta_{nm}\right)a_m = 0. \tag{5.85}$$

This is essentially a shorthand notation for a system of homogeneous linear equations, which has nontrivial (meaning non-zero) solutions only if the determinant $\left\|T_{nm} - \lambda\,\delta_{nm}\right\|$ vanishes (Cramer's rule, which was already mentioned earlier). If you go back to Sect. 3.2.3, you will find examples of eigenvector and eigenvalue calculations with matrices.

The matrix representation of operators is practically useful only if you know how the operator acts on the vectors of the chosen basis. If the operator in question is built out of basis vectors, the problem is resolved almost trivially. For instance, the projection operators introduced in Eq. 3.41 have a simple matrix representation in the same basis in which they are defined:

$$P^{(n)}_{km} = \langle\chi_k|\chi_n\rangle\langle\chi_n|\chi_m\rangle = \delta_{kn}\,\delta_{nm}$$

which is a matrix with a single non-zero element, $k = m = n$, on the main diagonal. You can find other examples of matrix representations for operators of this kind in the exercises in this chapter.

In most cases the issue of finding how operators act on the basis vectors is not that trivial. Often it is resolved by using position representation of operators and the basis vectors. This approach works especially well for the class of operators, which can be presented as a combination of position and momentum operators. This class includes many important operators, but not all of them.

5.2.1 Discrete Representation from a Continuous One

To illustrate this point, let me consider an example of a single particle of mass $m_e$ allowed to move freely along a linear segment of finite length $L$. The probability of finding the particle anywhere outside of this segment is assumed to be zero. This condition is most naturally expressed in the position representation, where the Hamiltonian, which contains only the kinetic energy term, takes the form

$$\hat{H} = \frac{\hat{p}_x^2}{2m_e} = -\frac{\hbar^2}{2m_e}\frac{d^2}{dx^2}. \tag{5.86}$$


The confinement of the particle inside the specified linear segment is formally expressed by the requirement that the wave function $\psi(x)$ representing states of the system is equal to zero outside of the allowed interval. The continuity of the wave function then requires that it also vanishes at the terminal points of this interval. Choosing the origin of the coordinate system at the left end of the allowed interval, and assigning the coordinate $x = L$ to its right end, I can express the confinement conditions by requiring that the wave function vanish at both ends of the interval:

$$\psi(0) = \psi(L) = 0. \tag{5.87}$$

It is easy to check that Schrödinger equation 5.37 for a free particle ($V(x) = 0$) has two linearly independent solutions $\psi_+(x) = \exp(ikx)$ and $\psi_-(x) = \exp(-ikx)$, where $k = \sqrt{2m_eE}/\hbar$. Now I need to construct a linear combination of these functions obeying the confinement conditions, Eq. 5.87. Beginning with a general solution

$$\psi(x) = Ae^{ikx} + Be^{-ikx},$$

I find that the requirement $\psi(0) = 0$ yields $A + B = 0$, which allows me to write the wave function as

$$\psi(x) = A\sin kx.$$

(I used Euler's formula $\sin kx = \left(\exp(ikx) - \exp(-ikx)\right)/2i$ and incorporated the constant $2i$ into the coefficient $A$.) The condition at $x = L$ yields

$$A\sin kL = 0$$

with two possible ways to fulfill it. One is to make $A = 0$, in which case the entire wave function vanishes, and we definitely do not want this to happen. Thus, you are stuck with the only other option, namely, to require that

$$kL = \pi n, \qquad n = 1, 2, \dots$$

This result means that the states of the system considered in this example can only be presented by a discrete set of wave functions

$$\psi_n(x) = A\sin k_n x$$

characterized by the parameter

$$k_n = \frac{\pi n}{L} \tag{5.88}$$


with corresponding discrete energy levels

$$E_n = \frac{\hbar^2\pi^2n^2}{2m_eL^2}. \tag{5.89}$$

The appearance of the discrete spectrum is not surprising here, of course, since the classical motion in this example is clearly bound. The coefficient $A$ remains unknown at this point: it cannot be fixed by the boundary conditions, which is a fairly typical situation in problems of this kind. I, however, have one additional weapon at my disposal, the normalization condition, which in this case reads (using the standard definition of the inner product for square-integrable functions)

$$\int_{-\infty}^{\infty}|\psi(x)|^2\,dx = |A|^2\int_0^L\sin^2 k_nx\,dx = 1 \;\Rightarrow\; A = \sqrt{\frac{2}{L}}$$

where at the last step, I chose $A$ to be a real positive quantity. This choice, while pleasing to the eye, does not make any difference, since the normalization condition only defines $A$ up to an arbitrary phase factor of the form $\exp(i\varphi)$, in line with the already mentioned general principle that vectors representing quantum states are always defined only up to a phase.
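To get a feeling for the numbers, here is a small sketch that evaluates Eq. 5.89 and checks numerically that the functions $\sqrt{2/L}\,\sin(\pi nx/L)$ are normalized and mutually orthogonal. It assumes an electron (the CODATA constants are typed in by hand) and uses a simple Simpson rule; the function names and the 1 nm width mentioned below are illustrative choices, not something fixed by the text.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_E = 9.1093837015e-31    # electron mass, kg (assumed particle)
EV = 1.602176634e-19      # one electronvolt in joules

def energy(n, L, mass=M_E):
    """Energy level E_n of Eq. 5.89 in joules for a segment of length L (meters)."""
    return (HBAR * math.pi * n) ** 2 / (2 * mass * L**2)

def psi(n, x, L):
    """Wave function sqrt(2/L) * sin(pi n x / L) inside the segment."""
    return math.sqrt(2 / L) * math.sin(math.pi * n * x / L)

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Ground-state energy in eV for a 1 nm wide segment.
ground_state_eV = energy(1, 1e-9) / EV
```

For a segment one nanometer wide the ground-state energy comes out at a few tenths of an electronvolt, and the ratio $E_2/E_1 = 4$ follows from the $n^2$ dependence in Eq. 5.89.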

The system of wave functions

$$\psi_n(x) = \sqrt{\frac{2}{L}}\,\sin\frac{\pi nx}{L} \tag{5.90}$$

forms a normalized orthogonal basis, which can be used to present any other wave function defined on the interval $x \in [0, L]$ (one can recognize here just a Fourier series expansion for a function defined on a finite interval). This basis can also be used to represent various operators acting on such functions. For instance, the Hamiltonian, Eq. 5.86, in this basis is represented by an infinite diagonal matrix:

$$H_{mn} = -\frac{\hbar^2}{2m_e}\,\frac{2}{L}\int_0^L\sin\frac{\pi mx}{L}\,\frac{d^2}{dx^2}\sin\frac{\pi nx}{L}\,dx = \frac{\hbar^2\pi^2n^2}{2m_eL^2}\,\frac{2}{L}\int_0^L\sin\frac{\pi mx}{L}\,\sin\frac{\pi nx}{L}\,dx = \frac{\hbar^2\pi^2n^2}{2m_eL^2}\,\delta_{mn}. \tag{5.91}$$

Now, assume that the particle that you follow is also subjected to an external uniform electric field (with all other conditions and limitations intact). This will add a potential energy term to the Hamiltonian of the form $V(x) = eFx$, where $e$ is the absolute value of the particle's charge, presumed to be negative, and $F$ is the magnitude of the field. Now, I want you to try to present the new Hamiltonian of the particle,

$$\hat{H} = \frac{\hat{p}_x^2}{2m_e} + eFx,$$

in the same basis of functions $\psi_n(x)$ defined in Eq. 5.90. The resulting matrix would have the diagonal part given in Eq. 5.91, and a part which can be written as $eFx_{mn}$, where

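$$x_{mn} = \frac{2}{L}\int_0^L x\,\sin\frac{\pi nx}{L}\,\sin\frac{\pi mx}{L}\,dx = \frac{1}{L}\int_0^L x\left[\cos\frac{\pi(n-m)x}{L} - \cos\frac{\pi(n+m)x}{L}\right]dx = \tag{5.92}$$

$$\frac{1}{L}\,\frac{L}{\pi}\left[\frac{x}{n-m}\sin\frac{\pi(n-m)x}{L} - \frac{x}{n+m}\sin\frac{\pi(n+m)x}{L}\right]_0^L - \frac{1}{L}\,\frac{L}{\pi}\left[\frac{1}{n-m}\int_0^L\sin\frac{\pi(n-m)x}{L}\,dx - \frac{1}{n+m}\int_0^L\sin\frac{\pi(n+m)x}{L}\,dx\right] =$$

$$\frac{L}{\pi^2}\,\frac{1}{(n-m)^2}\,\cos\frac{\pi(n-m)x}{L}\,\Bigg|_0^L - \frac{L}{\pi^2}\,\frac{1}{(n+m)^2}\,\cos\frac{\pi(n+m)x}{L}\,\Bigg|_0^L =$$

$$\frac{L}{\pi^2}\left[(-1)^{n-m} - 1\right]\frac{4nm}{(n^2-m^2)^2}, \qquad n \neq m. \tag{5.93}$$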

The diagonal element of this matrix, which is just the expectation value of the coordinate, is easily found (from the first line of Eq. 5.93) to be $x_{nn} = L/2$, which has an obvious physical meaning. The total Hamiltonian in the representation based on the functions defined in Eq. 5.90 is now an infinite nondiagonal matrix:

$$H_{mn} = \left(\frac{\hbar^2\pi^2n^2}{2m_eL^2} + \frac{eFL}{2}\right)\delta_{mn} + \frac{eFL}{\pi^2}\left[(-1)^{n-m} - 1\right]\frac{4nm}{(n^2-m^2)^2} \tag{5.94}$$

where the second term contributes only to the nondiagonal elements. The electric field-related correction to the diagonal elements of the Hamiltonian is just a constant and can be eliminated by choosing a different zero level for the energies, for instance, by writing the electric field potential as $eF(x - L/2)$.
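The closed-form matrix elements of Eq. 5.93 can be compared against a direct numerical evaluation of the integral in Eq. 5.92. The sketch below does this with a Simpson rule and also assembles a truncated version of the matrix in Eq. 5.94. It assumes dimensionless units in which $L = 1$ and energies are measured in units of $E_1 = \hbar^2\pi^2/(2m_eL^2)$, so the single parameter `field` stands for $eFL/E_1$; these units and all names are choices made for the example.

```python
import math

def x_closed(m, n, L=1.0):
    """Matrix element x_mn: L/2 on the diagonal, Eq. 5.93 off the diagonal."""
    if m == n:
        return L / 2
    return L / math.pi**2 * ((-1) ** (n - m) - 1) * 4 * n * m / (n**2 - m**2) ** 2

def x_numeric(m, n, L=1.0, steps=4000):
    """Simpson rule for (2/L) * integral of x sin(pi n x/L) sin(pi m x/L) over [0, L]."""
    def f(x):
        return 2 / L * x * math.sin(math.pi * n * x / L) * math.sin(math.pi * m * x / L)
    h = L / steps
    total = f(0.0) + f(L)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def hamiltonian(size, field):
    """Truncated Eq. 5.94 with L = 1, in units of E_1; field = e*F*L / E_1."""
    return [[(n * n if m == n else 0.0) + field * x_closed(m, n)
             for n in range(1, size + 1)]
            for m in range(1, size + 1)]

H = hamiltonian(4, 0.3)   # 4 x 4 block, illustrative field strength
```

Note that the factor $(-1)^{n-m} - 1$ makes $x_{mn}$ vanish whenever $n$ and $m$ have the same parity, so the field couples only states of opposite parity with respect to the middle of the segment.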

This example illustrates a rather general situation: often in order to find a representation of an operator in one basis, we have to use its known representation in a different basis. This approach works especially well with observables that can be expressed as combinations of position and momentum operators, whose representations in continuous bases were discussed above. This approach often leads to nondiagonal matrices, and finding the eigenvalues and eigenvectors of the operator of interest is reduced to finding the eigenvalues and eigenvectors of the resulting matrix. In many cases this cannot be done exactly because the dimensionality of the resulting matrices can be infinite, but it is often possible to truncate them and solve the problem approximately. How this is done practically will be discussed in a separate chapter.

5.2.2 Transition from One Discrete Basis to Another

Quite often you will find yourself in a situation when having found (or being given) matrix representation of an operator in one discrete basis, you will need to find an equivalent matrix representing this operator in a different basis. Here I will show how this can be done.

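So, let's assume that you have an operator $\hat{T}$ and a system of basis vectors $\left|\chi^{(\mathrm{old})}_m\right\rangle$. The representation of this operator in this basis, as we have already established, is given by the matrix

$$T^{(\mathrm{old})}_{mn} = \left\langle\chi^{(\mathrm{old})}_m\right|\hat{T}\left|\chi^{(\mathrm{old})}_n\right\rangle.$$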

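However, I would like to re-derive this expression in a slightly different way. Let me multiply the operator $\hat{T}$ by two unity operators expressed by the completeness relation, Eq. 3.42, formed with the vectors of this basis:

$$\hat{T} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\left|\chi^{(\mathrm{old})}_m\right\rangle\left\langle\chi^{(\mathrm{old})}_m\right|\hat{T}\left|\chi^{(\mathrm{old})}_n\right\rangle\left\langle\chi^{(\mathrm{old})}_n\right| = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T^{(\mathrm{old})}_{mn}\left|\chi^{(\mathrm{old})}_m\right\rangle\left\langle\chi^{(\mathrm{old})}_n\right|. \tag{5.95}$$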

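This representation of an operator in terms of a matrix and the operators $\left|\chi^{(\mathrm{old})}_m\right\rangle\left\langle\chi^{(\mathrm{old})}_n\right|$ is akin to the expansion of a vector into a linear combination of basis vectors. Now, let me assume that I have another basis $\left|\chi^{(\mathrm{new})}_m\right\rangle$, and I want to relate the matrix of the operator in this basis, $T^{(\mathrm{new})}_{mn}$, to the matrix $T^{(\mathrm{old})}_{mn}$. To achieve this goal, let me express the matrix $T^{(\mathrm{new})}_{kl}$ using Eq. 5.95:

$$T^{(\mathrm{new})}_{kl} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T^{(\mathrm{old})}_{mn}\left\langle\chi^{(\mathrm{new})}_k\middle|\chi^{(\mathrm{old})}_m\right\rangle\left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{new})}_l\right\rangle.$$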


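This can be rewritten with the help of two new matrices,

$$U_{nl} = \left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{new})}_l\right\rangle \tag{5.96}$$

and

$$\tilde{U}_{km} = \left\langle\chi^{(\mathrm{new})}_k\middle|\chi^{(\mathrm{old})}_m\right\rangle,$$

as

$$T^{(\mathrm{new})}_{kl} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\tilde{U}_{km}\,T^{(\mathrm{old})}_{mn}\,U_{nl}.$$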

(Note the position of the indexes in this expression, which adheres to the regular rule for matrix multiplication.) Now I need to figure out how to construct the matrices $U$ and $\tilde{U}$ and how they are related to each other. Let me first perform complex conjugation of each element of the matrix $\tilde{U}_{km}$ and take advantage of the main property of the inner product, Eq. 2.19:

$$\tilde{U}^*_{km} = \left\langle\chi^{(\mathrm{new})}_k\middle|\chi^{(\mathrm{old})}_m\right\rangle^* = \left\langle\chi^{(\mathrm{old})}_m\middle|\chi^{(\mathrm{new})}_k\right\rangle = U_{mk}.$$

Thus I can see that the matrix $\tilde{U}$ can be obtained from $U$ by complex conjugation and transposition, or, expressing this in fewer words, $\tilde{U}$ is the Hermitian conjugate of $U$: $\tilde{U} = U^\dagger$, and the matrix transformation rule can be presented as

$$T^{(\mathrm{new})}_{kl} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}U^\dagger_{km}\,T^{(\mathrm{old})}_{mn}\,U_{nl}. \tag{5.97}$$

Now, let me focus on one particular column of the matrix $U$, say, column $l_0$. Then you can easily recognize that the quantities $\left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{new})}_{l_0}\right\rangle$ are nothing but the coefficients of the expansion of the new basis vector $\left|\chi^{(\mathrm{new})}_{l_0}\right\rangle$ in the old basis:

$$\left|\chi^{(\mathrm{new})}_{l_0}\right\rangle = \sum_n\left|\chi^{(\mathrm{old})}_n\right\rangle\left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{new})}_{l_0}\right\rangle,$$

which gives a simple recipe for preparing the matrix $U$: find the representation of the $n$th vector of the new basis in the old one and use the corresponding coefficients as the $n$th column of the matrix $U$. Let me illustrate this rule with a simple example.

Example 17 (Transformation to a New Basis) Consider a Hermitian matrix

$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}$$

and rewrite it in the basis of its own eigenvectors.


Solution

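First I need to find these eigenvectors, which are given by the equation

$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \lambda\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}.$$

The corresponding eigenvalues are found from

$$(1-\lambda)^2 - 1 = 0 \;\Rightarrow\; -2\lambda + \lambda^2 = 0 \;\Rightarrow\; \lambda_1 = 0,\; \lambda_2 = 2.$$

Now I can find two eigenvectors. For $\lambda_1 = 0$, I have

$$a_1 + ia_2 = 0 \;\Rightarrow\; |0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix}$$

where I used $|0\rangle$ as a notation for a normalized eigenvector belonging to $\lambda_1 = 0$. You can verify that this vector is indeed normalized.

For $\lambda_2 = 2$, the eigenvector equations become

$$a_1 + ia_2 = 2a_1 \;\Rightarrow\; |2\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -i \end{bmatrix}.$$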

What you need to realize now (quite obvious, but it always gives students a shudder) is that the numbers in these columns are the coefficients in the representation of the new basis (vectors $|0\rangle$ and $|2\rangle$) in terms of the vectors of the old basis. Thus, the transformation matrix $U$ can be generated as

$$U = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}$$

and its Hermitian conjugate matrix as

$$U^\dagger = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}.$$

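Plugging these matrices into the transformation rule, Eq. 5.97, I get

$$\frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 0 & 2 \\ 0 & -2i \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix},$$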

which is exactly what you should have expected: a matrix in the basis of its own eigenvectors is diagonal with eigenvalues along the main diagonal.
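The arithmetic of Example 17 is easy to repeat numerically. The sketch below applies Eq. 5.97 with plain nested lists; the helper names `dagger` and `matmul` are introduced here only for illustration.

```python
def dagger(M):
    """Hermitian conjugate: transposition plus complex conjugation."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

s = 2 ** -0.5
T_old = [[1, 1j], [-1j, 1]]          # the matrix of Example 17 in the old basis
U = [[s, s], [1j * s, -1j * s]]      # columns hold the eigenvectors |0> and |2>

T_new = matmul(matmul(dagger(U), T_old), U)   # Eq. 5.97
```

Up to rounding errors, `T_new` has the eigenvalues 0 and 2 on the main diagonal and zeroes elsewhere. Swapping the two columns of `U` would simply swap the order of the eigenvalues on the diagonal.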

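Sometimes the transformation rule connecting representations of operators in different bases is presented in an alternative form

$$T^{(\mathrm{new})}_{kl} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\tilde{U}_{km}\,T^{(\mathrm{old})}_{mn}\,\tilde{U}^\dagger_{nl} \tag{5.98}$$

with the matrix $\tilde{U}_{nl}$ defined as

$$\tilde{U}_{nl} = \left\langle\chi^{(\mathrm{new})}_n\middle|\chi^{(\mathrm{old})}_l\right\rangle. \tag{5.99}$$

Complex conjugation of Eq. 5.99 yields

$$\tilde{U}^*_{nl} = \left\langle\chi^{(\mathrm{new})}_n\middle|\chi^{(\mathrm{old})}_l\right\rangle^* = \left\langle\chi^{(\mathrm{old})}_l\middle|\chi^{(\mathrm{new})}_n\right\rangle = U_{ln}.$$

Performing matrix transposition and recalling that complex conjugation plus transposition yields Hermitian conjugation, you can see that

$$\tilde{U} = U^\dagger$$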

and that Eq. 5.98 and Eq. 5.97 are equivalent to each other.

The transformation matrix in the form of Eq. 5.99 appears naturally when one is looking for the transformation between the components of the same vector written in two different bases. Indeed, consider a vector $|\alpha\rangle$ represented in two different bases as

$$|\alpha\rangle = \sum_l a^{(\mathrm{old})}_l\left|\chi^{(\mathrm{old})}_l\right\rangle = \sum_l a^{(\mathrm{new})}_l\left|\chi^{(\mathrm{new})}_l\right\rangle.$$

The simplest way to express the coefficients $a^{(\mathrm{new})}_l$ in terms of the coefficients $a^{(\mathrm{old})}_l$, which is the goal of this exercise, is to premultiply the expression above by the bra vector $\left\langle\chi^{(\mathrm{new})}_m\right|$ and take advantage of the orthogonality of the basis vectors. This yields

$$a^{(\mathrm{new})}_m = \sum_l\left\langle\chi^{(\mathrm{new})}_m\middle|\chi^{(\mathrm{old})}_l\right\rangle a^{(\mathrm{old})}_l = \sum_l\tilde{U}_{ml}\,a^{(\mathrm{old})}_l = \sum_l U^\dagger_{ml}\,a^{(\mathrm{old})}_l.$$

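What is left for me to do now is to show that the matrix $U$ defined by Eq. 5.96 is unitary. To this end I need to compute the product of the two matrices $U_{nm}$ and $U^\dagger_{ml}$, using the standard matrix multiplication rule:

$$\left(UU^\dagger\right)_{nl} = \sum_m U_{nm}\,U^\dagger_{ml}.$$

Substituting here Eq. 5.96, I can write

$$\left(UU^\dagger\right)_{nl} = \sum_m\left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{new})}_m\right\rangle\left\langle\chi^{(\mathrm{new})}_m\middle|\chi^{(\mathrm{old})}_l\right\rangle = \left\langle\chi^{(\mathrm{old})}_n\middle|\chi^{(\mathrm{old})}_l\right\rangle = \delta_{nl},$$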

where I replaced the sum over $m$ with a unity operator, because it is again just a completeness condition, and used the orthonormalization of the vectors of the basis to replace their inner product with Kronecker's delta symbol. This calculation reveals that $U^\dagger = U^{-1}$, which is the definition of a unitary matrix. One should not be surprised that the transformation of the vector components from one basis to another is provided by a unitary matrix. Indeed, such a transformation clearly should not change the norm of the vector, and this is, indeed, one of the important properties of unitary operators.
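Both statements, unitarity of $U$ and conservation of the norm under the change of components, can be verified numerically. The sketch below reuses the matrix $U$ of Example 17 as a convenient test case and an arbitrarily chosen normalized vector; the helper names and the vector are assumptions made for the illustration.

```python
def dagger(M):
    """Hermitian conjugate: transposition plus complex conjugation."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(M, vec):
    """Matrix acting on a column of components."""
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in M]

def norm_sq(vec):
    """Sum of |a_n|^2 over all components."""
    return sum(abs(c) ** 2 for c in vec)

s = 2 ** -0.5
U = [[s, s], [1j * s, -1j * s]]      # transformation matrix of Example 17

a_old = [0.6, 0.8j]                  # components in the old basis (illustrative)
a_new = apply(dagger(U), a_old)      # components in the new basis: a_new = U^dagger a_old
a_back = apply(U, a_new)             # going back with U itself

UUd = matmul(U, dagger(U))           # should reproduce the unit matrix
```

Here `norm_sq(a_new)` coincides with `norm_sq(a_old)`, and `a_back` reproduces `a_old`, because $UU^\dagger$ is the unit matrix.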

5.2.3 Spin Operators

The approach to generating representations of operators outlined in the previous section would not work for operators which cannot be built out of momentum and position. If, however, you somehow know the eigenvalues of the operator in question (most likely this knowledge comes from distilling empirical facts), you can construct the matrix of this operator in the basis of its own eigenvectors. Indeed, if $|\chi_m\rangle$ is the eigenvector of $\hat{T}$ corresponding to the eigenvalue $t_m$, i.e., $\hat{T}|\chi_m\rangle = t_m|\chi_m\rangle$, then Eq. 5.84 immediately gives

$$T_{nm} = \langle\chi_n|\hat{T}|\chi_m\rangle = t_m\langle\chi_n|\chi_m\rangle = t_m\,\delta_{nm}.$$

Thus, any operator in the basis of its own eigenvectors is presented by a diagonal matrix with eigenvalues along the main diagonal. Unfortunately, we often have to deal with a set of non-commuting operators, only one of which can be presented by a diagonal matrix. The question then remains how to generate a matrix representation of the other non-commuting operators in the same basis. Fortunately, in all practical situations, this problem can be solved if one knows the commutation relations between the relevant operators. I will illustrate this approach by considering the representation of angular momentum operators in the situation when the eigenvalues of the $z$-component of the angular momentum can only take two values, $+\hbar/2$ and $-\hbar/2$. The quantum numbers $m$ and $l$ introduced in Sect. 3.3.4 take in this case the values $\pm 1/2$ and $1/2$, respectively. In Sect. 5.1.4, you saw that the orbital angular momentum, which is constructed of position and momentum operators, admits only integer values for these numbers. The suggested half-integer values, which are allowed by the algebraic properties of these operators, can, therefore, correspond only to a very special angular momentum of electrons not related to their orbital motion. This intrinsic angular momentum is known as spin. Leaving a more detailed discussion of this quantity till later, here let's just accept its existence and use it to illustrate a method of generating matrix representations of operators which do not have a position or momentum representation.

To distinguish between spin and orbital angular momentum, I will introduce special notations for the former, designating the respective operators as $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$, which have the same meaning as the operators $\hat{L}_x$, $\hat{L}_y$, and $\hat{L}_z$ of Sect. 3.3.4. Accordingly, I will replace the quantum number $l$ with $s$ and $m$ with $m_s$. It is important to realize from the outset that, while the orbital quantum number $l$ is allowed to take any integer value, the value of the respective spin number $s$ is fixed at $1/2$ and cannot be changed: it is an intrinsic property of electrons just like their mass or charge. Thus, the only quantum number which can be used to distinguish between different spin states is $m_s$.

Since $m_s$ takes on only two distinct values, there exist only two respective states described by eigenvectors of the operator $\hat{S}_z$. Thus, the space occupied by different spin states is two-dimensional, the respective vectors are represented by $2 \times 1$ column vectors, and operators are represented by $2 \times 2$ matrices. In the basis of its own eigenvectors, $\hat{S}_z$ is simply a diagonal matrix:

$$S_z = \begin{bmatrix} \dfrac{\hbar}{2} & 0 \\ 0 & -\dfrac{\hbar}{2} \end{bmatrix}, \tag{5.100}$$

while the states take the form of columns

$$|1/2\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |-1/2\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \tag{5.101}$$

where I chose to number the state corresponding to the positive eigenvalue as first. (This choice determines the positions of the negative and positive elements in the matrix $S_z$ and of the ones and zeroes in the corresponding columns.) An arbitrary state in the space of spin states can be written down as a linear combination of the basis vectors:

$$|\chi\rangle = a\begin{bmatrix} 1 \\ 0 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{5.102}$$

The result expressed by Eq. 5.100 is somewhat obvious, and our main task is to find matrices realizing a representation of the two remaining components of the spin angular momentum, $\hat{S}_x$ and $\hat{S}_y$, in this basis. (Operator $\hat{S}^2$ in this instance is trivial: it is diagonal with identical diagonal elements equal to $\hbar^2 s(s+1) = 3\hbar^2/4$, so it is proportional to the identity matrix.)

I begin solving this problem by focusing on operators $\hat{S}_\pm = \hat{S}_x \pm i\hat{S}_y$, which are spin analogs of the ladder operators $\hat{L}_\pm$ introduced in Eqs. 3.64 and 3.65. Since I postulated that spin operators obey the same commutation relations as operators $\hat{L}_{x,y,z}$, I can use the results obtained for these operators, in particular Eq. 3.75 describing how operator $\hat{L}_+$ acts on eigenvectors of $\hat{L}_z$. Adapting this equation to the case of spin states, I can write

$$ \hat{S}_+ |s, m_s\rangle = \hbar\sqrt{\frac{3}{4} - m_s(m_s+1)}\;|s, m_s+1\rangle \tag{5.103} $$

where I took into account that $s = 1/2$. Applying this equation to the only two existing states $|1/2\rangle$ and $|-1/2\rangle$ (I dropped quantum number $s$, because it never changes), I have

$$ \hat{S}_+ |1/2\rangle = 0, \qquad \hat{S}_+ |-1/2\rangle = \hbar\,|1/2\rangle \tag{5.104} $$

from which you can immediately infer that the matrix representation of $\hat{S}_+$ is

$$ S_+ = \hbar\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \tag{5.105} $$

The matrix representation for the lowering operator $\hat{S}_-$ can be derived in a similar way using Eq. 3.76, which yields

$$ \hat{S}_- |1/2\rangle = \hbar\,|-1/2\rangle, \qquad \hat{S}_- |-1/2\rangle = 0, \tag{5.106} $$

but it is much faster simply to recall that $\hat{S}_- = \hat{S}_+^\dagger$, so that the respective matrix is obtained by matrix transposition and complex conjugation of Eq. 5.105:

$$ S_- = \hbar\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{5.107} $$

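Now, using the definition of the ladder operators, you can write for $\hat{S}_x$ and $\hat{S}_y$:

$$ \hat{S}_x = \frac{1}{2}\left(\hat{S}_+ + \hat{S}_-\right) \tag{5.108} $$

$$ \hat{S}_y = \frac{1}{2i}\left(\hat{S}_+ - \hat{S}_-\right) \tag{5.109} $$

which together with Eqs. 5.105 and 5.107 generate the required matrices:

$$ S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{5.110} $$

$$ S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{5.111} $$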

Equations 5.110 and 5.111 provide the solution to the problem of finding the matrix representation for operators, which cannot be reduced to combinations of the position and momentum. As you can see, the commutation relations played the crucial role in solving this problem.
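The ladder-operator construction described above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes units with $\hbar = 1$, uses plain nested lists instead of a matrix library, and the function names (`spin_matrices`, `commutator`, and so on) are hypothetical helpers introduced only for this illustration.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1, chosen only to keep the numbers readable


def dagger(a):
    """Hermitian conjugate (transpose plus complex conjugation) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def spin_matrices(s=0.5):
    """Return (Sx, Sy, Sz) in the basis |s, m_s> ordered m_s = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m_values = [s - k for k in range(dim)]
    s_plus = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = m_values[col]
        # S+ |s, m> = hbar * sqrt(s(s+1) - m(m+1)) |s, m+1>; |s, m+1> is row col-1
        s_plus[col - 1][col] = complex(HBAR * math.sqrt(s * (s + 1) - m * (m + 1)))
    s_minus = dagger(s_plus)  # S- is the Hermitian conjugate of S+
    sx = [[(s_plus[i][j] + s_minus[i][j]) / 2 for j in range(dim)] for i in range(dim)]
    sy = [[(s_plus[i][j] - s_minus[i][j]) / 2j for j in range(dim)] for i in range(dim)]
    sz = [[complex(HBAR * m_values[i]) if i == j else 0j for j in range(dim)]
          for i in range(dim)]
    return sx, sy, sz


sx, sy, sz = spin_matrices(0.5)
print("Sx =", sx)
print("Sy =", sy)
print("[Sx, Sy] =", commutator(sx, sy))  # should equal i*hbar*Sz
```

For $s = 1/2$ the printed matrices should coincide with Eqs. 5.110 and 5.111 in units of $\hbar$, and the commutator should reproduce $i\hbar S_z$, which is the relation the whole construction rests on.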

5.3 Problems

Section 5.1.1

Problem 54 Reproduce calculations leading to Eq. 5.16 for the momentum representation of the coordinate.

Problem 55 Derive Eq. 5.20 generalizing the approach that led to Eq. 5.15 in Sect. 5.1.1.

Problem 56 Derive Eq. 5.24 using the same method that I used in deriving Eq. 5.23 (do not attempt to simply invert the previous equation).

Problem 57 Assuming that function $\chi_s(q)$ presenting eigenvectors of operator $\hat{S}$ in the basis of operator $\hat{Q}$ is given by

$$ \chi_s(q) = A e^{isq - s^2 q^2}, $$

find the integral representation of the operator $\hat{S}$ in this basis.

Section 5.1.2

Problem 58

1. Prove that the momentum operator is “odd” (changes its sign upon the parity transformation).

2. Prove that the operator of the angular momentum is invariant with respect to the parity transformation (“even”).

Section 5.1.3

Problem 59 Which of the following can be used as wave functions describing states of the discrete spectrum:

1. $e^{x^2/2}$

2. $\left(2x^2 - x^4/3\right)e^{-x^2/2}$

3. $A\sin kx$

4. $B\exp(ikx) + C\exp(-ikx)$

5. $x e^{-|x|}$

6. $A e^{-x^2}\cos kx$

Problem 60 It is known that the potential energy of a quantum particle exhibits a finite discontinuity at point $x = 0$. It is also known that for $x < 0$ and $x > 0$, the wave functions of the particle are presented by

$$ \psi(x) = \begin{cases} A\left(x^2 + 2\right)\exp\left(-\alpha_1 x^2\right), & x < 0 \\ B\exp\left(-\alpha_2 x^2\right), & x > 0. \end{cases} $$

Using continuity of the wave function and its derivative, establish relations between parameters $A$, $B$, $\alpha_1$, and $\alpha_2$.

Problem 61 Prove the following identity:

$$ \chi^*(\mathbf{r},t)\nabla^2\chi(\mathbf{r},t) - \chi(\mathbf{r},t)\nabla^2\chi^*(\mathbf{r},t) \equiv \nabla\cdot\left[\chi^*(\mathbf{r},t)\nabla\chi(\mathbf{r},t) - \chi(\mathbf{r},t)\nabla\chi^*(\mathbf{r},t)\right]. $$

Problem 62 Compute probability current densities for a particle in the states described by the following wave functions:

1. $\psi(z) = A\exp(ikz) + B\exp(-ikz)$

2. $\psi(x) = A\cos kx$

3. $\psi(r) = \dfrac{A}{r}\exp(ikr) + \dfrac{B}{r}\exp(-ikr)$, where $r = \sqrt{x^2 + y^2 + z^2}$

4. $\psi(\mathbf{r}) = A\exp(i\mathbf{k}\cdot\mathbf{r}) + C\exp(-i\mathbf{k}\cdot\mathbf{r})$

Section 5.1.4

Problem 63 Derive expressions for operators $\hat{L}_x$ and $\hat{L}_y$ in spherical coordinates presented in Eqs. 5.58 and 5.59.

Problem 64 Derive the orthogonalization condition for the associated Legendre functions with $l_1 = l_2$ and $m_1 \neq m_2$ (Eq. 5.73). Do not attempt to obtain the normalization coefficient.

Problem 65 Consider a function of polar and azimuthal angles $\theta$ and $\varphi$ defined as

$$ \psi(\theta,\varphi) = \sin\theta\,(1 - \cos\theta)\cos\varphi. $$

1. Normalize this function.

2. Present this function as a linear combination of spherical harmonics $Y_l^m(\theta,\varphi)$.

3. If the observables presented by operators $\hat{L}^2$ and $\hat{L}_z$ are measured when a particle is in the state presented by this wave function, what would be the possible outcomes and their probabilities?

4. Find the expectation values and uncertainties of these observables in this state.

Problem 66 Repeat the previous problem for the following function:

$$ \psi(\theta,\varphi) = \frac{3}{2}\sin 2\theta\,\exp(-i\varphi) + 2\sin^2\theta\,\sin 2\varphi, $$

but do not attempt to normalize it before rewriting it as a combination of spherical harmonics.

Problem 67 Find the coordinate representation for the lowering and raising ladder operators introduced in Sect. 3.3.4, and using the expressions found, find $Y_l^l(\theta,\varphi)$ and $Y_l^{-l}(\theta,\varphi)$.

Problem 68 Find all zeroes of the angular probability distribution for a particle in angular states described by spherical harmonics $Y_3^0(\theta,\varphi)$, $Y_3^1(\theta,\varphi)$, $Y_3^2(\theta,\varphi)$, and $Y_3^3(\theta,\varphi)$.

Problem 69 Find energy values for the system described by the Hamiltonian:

$$ \hat{H} = \frac{\hat{L}_x^2 + \hat{L}_y^2}{2I_1} + \frac{\hat{L}_z^2}{2I_2}. $$

Section 5.2

Problem 70 Using eigenvectors $|1\rangle$ and $|2\rangle$ of matrix

$$ \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} $$

as a basis, construct the matrix representation of operators $|1\rangle\langle 1|$ and $|2\rangle\langle 2|$ and verify the closure (or completeness) condition:

$$ |1\rangle\langle 1| + |2\rangle\langle 2| = \hat{I} $$

where $\hat{I}$ is the unity operator.

Problem 71 Find the matrix of the operator:

$$ \hat{A} = |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2| + |\psi_3\rangle\langle\psi_3| - i|\psi_1\rangle\langle\psi_2| - |\psi_1\rangle\langle\psi_3| + i|\psi_2\rangle\langle\psi_1| - |\psi_3\rangle\langle\psi_1| $$

in the basis formed by orthonormalized vectors $|\psi_1\rangle$, $|\psi_2\rangle$, and $|\psi_3\rangle$.

Problem 72 Write down expressions for spherical harmonics with orbital quantum number $l = 1$. You can consider them as a basis in the subspace of eigenvectors of operator $\hat{L}^2$ belonging to this eigenvalue in the sense that any linear combination of them will also be an eigenvector of this operator.

1. Prove that this is indeed the case, i.e., that any linear combination of spherical harmonics with $l = 1$ represents an eigenvector of $\hat{L}^2$ with the same eigenvalue.

2. Find the matrix representation of operator $\hat{L}_x$ in this basis, and using the obtained matrix, find the representation of this operator's eigenvectors in this basis.

3. The found eigenvectors can also be considered as yet another basis. Find the representation of operators $\hat{L}_x$ and $\hat{L}_z$ in this basis.

Problem 73 Present operator $i\,d/dx$ in the basis of spherical harmonics with $l = 1$.

Problem 74 Consider the matrix:

$$ A = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}. $$

Transform this matrix to the basis of its eigenvectors. Verify that the elements along the diagonal of the resulting matrix are eigenvalues of $A$.

Problem 75 Consider two matrices:

$$ A = \begin{bmatrix} 1 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} $$

and

$$ B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & i \\ 0 & -i & 0 \end{bmatrix}. $$

Rewrite matrix $B$ in the basis formed by eigenvectors of matrix $A$.

Section 5.2.3

Problem 76 Using the same approach that was used in Sect. 5.2.3 for spin $1/2$, find the matrix representation for the operators of the spin $s = 3/2$. Hint: What is the dimension of the space that contains the vectors representing states of this spin?

Part II Quantum Models

In this part of the book, we will play with some of the toys, which physicists created in order to get better insight into a variety of new and unusual properties exhibited by real systems obeying laws of quantum mechanics. These toys rarely represent real systems and this is why we call them models. Still, in many instances they provide necessary first experience and conceptual understanding required to deal with reality in all its complexity. Using models allows us to focus on those properties of real world, which appear to be of the most significance at least for the class of problems we are interested in. One can think of models in quantum mechanics as of impressionist or postimpressionist paintings, when instead of painstaking attention to details, the main focus is on capturing “the essence” of the object, whatever this might mean. Quantum mechanical models develop physical intuition about the phenomena under study and can often be used as a first iteration of an approximation scheme yielding more accurate and quantitative description of nature.

Chapter 6 One-Dimensional Models

One-dimensional models might appear in quantum mechanics in two, in a way, diametrically opposite situations. In one case, you can pretend that the potential energy of a particle changes only in one direction, such as the potential energy of a particle in a uniform electric field. Classically, this would mean a motion characterized by acceleration in one direction and constant velocity in perpendicular directions. By choosing an appropriate inertial coordinate system, you can always eliminate the constant velocity component and consider this motion as rectilinear. Quantum mechanically, this situation has to be described in the coordinate representation, and the respective coordinate wave function can be presented in the form

$$ \psi(\mathbf{r}) = e^{i(k_x x + k_y y)}\varphi(z) \tag{6.1} $$

where the Z-axis of the coordinate system is arbitrarily chosen to lie along the direction in which the potential energy changes. The behavior of the wave function in the two perpendicular directions (X and Y) is that of a free particle with conserved momentum components $p_x = \hbar k_x$ and $p_y = \hbar k_y$. Substituting this expression into Eq. 5.35, where the potential energy is taken to have the form $V(\mathbf{r}) \equiv V(z)$, and canceling the exponential factors on both sides of the equation, you will end up with the following one-dimensional equation:

$$ -\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} + V(z)\varphi(z) = E_z\varphi(z) \tag{6.2} $$

where

$$ E_z = E - \frac{p_x^2}{2m_e} - \frac{p_y^2}{2m_e} $$

is the contribution of the motion in the $z$ direction to the total energy of the system $E$. Values of $p_x$ and $p_y$ are determined by the initial state of the system, which may or may not be one of the eigenvectors of the Hamiltonian with given values of $p_x$ and $p_y$. In the latter case, the particular solution of the time-dependent Schrödinger equation satisfying the initial conditions will be given by a linear combination of the functions presented in Eqs. 6.1 and 6.2, but for now I will focus only on the stationary states, which correspond to initial conditions with definite $p_x$ and $p_y$.

Fig. 6.1 A schematic of a semiconductor heterostructure, in which the motion of electrons in the direction perpendicular to the planes of the layers can be described by the one-dimensional model. (Layer labels in the figure: alternating GaAlAs and GaAs.)

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_6

While for a long time this type of one-dimensional model was used mostly in classrooms to illustrate basic quantum effects to unsuspecting students, the technological advances of the last 50 years made this model quite relevant as a stepping stone to understanding properties of practically important artificial structures made of planar layers of several different semiconductors arranged in an alternating order (see Fig. 6.1). It can be shown (way above your pay grade though) that the motion of electrons in such structures can be approximately described by a potential energy, which only changes in the direction perpendicular to the plane of the layers (growth direction).

The second situation, in which the one-dimensional model can have at least some relation to reality, is the case of potentials confining the motion of the particle in all directions but one. One can imagine a particle moving inside a cylindrical tube with impenetrable walls. The motion perpendicular to the axis of the cylinder is characterized by discrete allowed values of energy (I will show it later; for now you will have to trust me on that), and if the radius of the tube is small enough, the distance between adjacent energy levels can be so large that for all practical purposes only one of these energy levels needs to be taken into account. In this case, the transverse (as in perpendicular to the axis of the cylinder) motion is completely "frozen," and one is again left with purely one-dimensional motion. In all cases, we are dealing with the Schrödinger equation in the form of Eq. 6.2, which is the main object of study in this chapter.


6.1 Free Particle and the Wave Packets

Before taking on quantum states of electrons in one-dimensional piecewise potentials such as wells or barriers, it is useful to consider the simplest quantum mechanical model: a freely propagating particle, i.e., one not interacting with anything. In classical physics, as we all know, such a particle would move with a constant velocity, $v$, and can be characterized by conserved kinetic energy $K = mv^2/2$ and momentum $p = mv$. In quantum mechanics, states of a free particle are solutions of the Schrödinger equation with zero potential

$$ i\hbar\frac{\partial|\Psi\rangle}{\partial t} = \frac{\hat{P}^2}{2m_e}|\Psi\rangle. \tag{6.3} $$

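It is quite easy to see that the stationary states of a free particle are eigenvectors of the momentum operator:

$$ |\Psi\rangle = \exp\left(-i\frac{E_p}{\hbar}t\right)|\mathbf{p}\rangle \tag{6.4} $$

where $|\mathbf{p}\rangle$ is defined by $\hat{\mathbf{P}}|\mathbf{p}\rangle = \mathbf{p}|\mathbf{p}\rangle$. Substitution of Eq. 6.4 into Eq. 6.3 yields

$$ E_p = \frac{p^2}{2m_e} \tag{6.5} $$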

which is the expected classical relation between energy and momentum of a free particle, often called the dispersion relation. Historically, the Schrödinger equation was devised to make sure that the quantum theory respects this relation between energy and momentum. Indeed, in the position representation, the Schrödinger equation becomes

$$ i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = -\frac{\hbar^2\nabla^2}{2m_e}\Psi(\mathbf{r},t) \tag{6.6} $$

with stationary state solutions of the form

$$ \Psi(\mathbf{r},t) = \frac{1}{(2\pi\hbar)^{3/2}}\exp\left(-i\frac{E_p}{\hbar}t + i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right) \tag{6.7} $$

where I used $\delta$-function normalized eigenvectors of the momentum operator given in Eq. 5.21. Now, one can argue that the Schrödinger equation contains the first-order time derivative because the dispersion relation, Eq. 6.5, is linear in $E$, while the derivative over coordinates must be of the second order to reproduce the term $p^2$ in Eq. 6.5. Further, one can argue that since the time derivative in the Schrödinger equation is only of the first order, the corresponding wave function must be represented by a complex exponential function rather than by a real trigonometric function, which, in turn, makes the factor $i$ in front of the time derivative necessary to compensate for the similar factor in the argument of the wave function.
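As a quick check of this argument, one can act with each side of Eq. 6.6 on the plane wave of Eq. 6.7. The short worked step below uses only the symbols already defined in the text:

```latex
\begin{align*}
i\hbar\,\frac{\partial \Psi}{\partial t}
  &= i\hbar\left(-\frac{iE_p}{\hbar}\right)\Psi = E_p\,\Psi, \\
-\frac{\hbar^2\nabla^2}{2m_e}\,\Psi
  &= -\frac{\hbar^2}{2m_e}\left(\frac{i\mathbf{p}}{\hbar}\right)^{2}\Psi
   = \frac{p^2}{2m_e}\,\Psi .
\end{align*}
```

The two sides agree exactly when $E_p = p^2/2m_e$, i.e., when the dispersion relation of Eq. 6.5 holds. The single factor of $i$ produced by the first-order time derivative is cancelled by the $i$ in front of it, while the two factors of $i$ from the second-order spatial derivative combine into the minus sign that cancels the one in front of $\hbar^2\nabla^2$.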

The wave function of the form given in Eq. 6.7 was conceived in the early days of quantum mechanics as a means to reconcile particle and wavelike properties of quantum objects. However, it was clear from the very beginning that regardless of the chosen interpretation (statistical due to Born or Schrödinger's pilot wave), there are several problems with assigning this function to represent quantum states of real particles. First, its absolute value is uniform in space, which can hardly represent an actual localized particle regardless of the chosen interpretation. Also, the motion of the wave represented by Eq. 6.7 is characterized by phase velocity $v_{ph} = \omega/k = E/p = p/(2m_e)$, which is half of the corresponding classical velocity $v_{cl} = p/m_e$, making it difficult to associate it with the motion of a particle.

To get around this conundrum, it was suggested that actual states of the particles (in either interpretation) are presented not by stationary states but by their superposition, which will still solve the Schrödinger equation 6.6. It is quite easy to show that by choosing an appropriate superposition, it is possible, for instance, to localize a particle within an arbitrarily small region, solving at least one of the listed problems. To see how this comes about, consider a wave function at time $t = 0$ and form a superposition of the form

$$ \psi(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\, A(\mathbf{p})\exp\left(i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right). \tag{6.8} $$

In Sect. 5.1.1, it was shown that Eq. 6.8 can be inverted to yield

$$ A(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3r\, \psi(\mathbf{r})\exp\left(-i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right) \tag{6.9} $$

so that by choosing an appropriate $A(\mathbf{p})$, I can "generate" an initial ($t = 0$) wave function with an arbitrary degree of localization. Now all I need is to consider the time dependence of this initial superposition to see if the other problems outlined above can also be circumvented by the superposition states, which, in the case of freely propagating particles, are often called wave packets.

Here I will focus on just one particular example of the wave packets, which despite its relative simplicity will help me to illustrate most of the relevant ideas. First of all, I will simplify the consideration by limiting it to the case of one-dimensional motion described by a wave function which depends on a single coordinate, say, $z$. Integrals in Eqs. 6.8 and 6.9 are in this case reduced to the one-dimensional form

$$ \psi(z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} dp_z\, A(p_z)\exp\left(i\frac{p_z z}{\hbar}\right) \tag{6.10} $$

$$ A(p_z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} dz\, \psi(z)\exp\left(-i\frac{p_z z}{\hbar}\right). \tag{6.11} $$

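Next, I will assume that the initial state of the particle is described by the function

$$ \psi(z) = C\exp\left[-\frac{(z-\bar{z})^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar{p}_z}{\hbar}z\right) \tag{6.12} $$

where constant $C$ is found from the normalization condition

$$ \int_{-\infty}^{\infty}|\psi(z)|^2dz = |C|^2\int_{-\infty}^{\infty}\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right]dz = |C|^2\sqrt{2}\,\Delta z_0\int_{-\infty}^{\infty}\exp\left(-x^2\right)dx = |C|^2\Delta z_0\sqrt{2\pi} = 1 \;\Longrightarrow\; C = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}. $$

In the course of computing the normalization integral, I introduced a new integration variable $x = (z-\bar{z})/\left(\sqrt{2}\,\Delta z_0\right)$ and used the well-known integral $\int_{-\infty}^{\infty}\exp\left(-x^2\right)dx = \sqrt{\pi}$. Thus, my initial state is represented by the normalized wave function, where the amplitude of the plane wave $\exp(i\bar{p}_z z/\hbar)$ is modulated by the so-called Gaussian function:

$$ \psi(z) = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}\exp\left[-\frac{(z-\bar{z})^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar{p}_z}{\hbar}z\right). \tag{6.13} $$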

Fig. 6.2 Normalized Gaussian wave functions with different values of the width parameter $\Delta z_0$: with decreasing $\Delta z_0$ the function narrows, while its maximum grows such that the total area under the curve remains equal to unity

The probability distribution corresponding to this wave function is peaked at $z = \bar{z}$ and falls off from its maximum value as $z$ moves away from $\bar{z}$. Parameter $\Delta z_0$ determines how fast the decrease of the probability takes place: the larger $\Delta z_0$, the larger the deviation from $\bar{z}$ required to decrease the probability density by a given factor. Varying $\Delta z_0$, one can control the degree of particle localization: a smaller $\Delta z_0$ corresponds to better localized particles (see Fig. 6.2). Formally speaking, one can define $\bar{z}$ and $\Delta z_0$ as the expectation value and uncertainty of the coordinate in the state described by this wave function. Indeed, I can easily compute

$$ \langle z\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty}dz\, z\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right] = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar{z}\right)\exp\left(-x^2\right)dx = \bar{z} $$

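where I took into account that the normalization integral computed earlier is equal to unity and the fact that the integral of an odd function over a symmetric interval is zero. The uncertainty takes a bit more work:

$$ \langle z^2\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty}dz\, z^2\exp\left[-\frac{(z-\bar{z})^2}{2(\Delta z_0)^2}\right] = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar{z}\right)^2\exp\left(-x^2\right)dx = \bar{z}^2 + \frac{2(\Delta z_0)^2}{\sqrt{\pi}}\int_{-\infty}^{\infty}x^2\exp\left(-x^2\right)dx = \bar{z}^2 + (\Delta z_0)^2 $$

where I used another well-known integral $\int_{-\infty}^{\infty}x^2\exp\left(-x^2\right)dx = \sqrt{\pi}/2$. Subtracting $\bar{z}^2$ from $\langle z^2\rangle$, you can convince yourself that $\Delta z_0$ is, indeed, the uncertainty of the coordinate.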

It should be noted that these arguments do not contradict Schrödinger's pilot wave interpretation, according to which the wave presented by the wave packet is a real material object accompanying a particle and whose width defines the degree of particle localization.
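The normalization, expectation value, and uncertainty computed above can also be confirmed by brute-force numerical integration. This is a minimal sketch under my own assumptions: the values of $\bar{z}$ and $\Delta z_0$ are arbitrary test numbers, the integration window of twelve widths and the trapezoidal rule are choices made for illustration, and the function names are hypothetical.

```python
import math


def gaussian_density(z, z_bar, dz0):
    """|psi(z)|^2 for the Gaussian wave function of Eq. 6.13."""
    return math.exp(-(z - z_bar) ** 2 / (2 * dz0 ** 2)) / (dz0 * math.sqrt(2 * math.pi))


def moments(z_bar, dz0, half_width=12.0, n=20001):
    """Return (norm, <z>, uncertainty) via the trapezoidal rule on z_bar +/- half_width*dz0."""
    a = z_bar - half_width * dz0
    h = 2 * half_width * dz0 / (n - 1)
    norm = mean = mean_sq = 0.0
    for k in range(n):
        z = a + k * h
        w = h * (0.5 if k in (0, n - 1) else 1.0)  # trapezoid end-point weights
        p = gaussian_density(z, z_bar, dz0)
        norm += w * p
        mean += w * z * p
        mean_sq += w * z * z * p
    return norm, mean, math.sqrt(mean_sq - mean ** 2)


norm, mean, uncertainty = moments(z_bar=1.5, dz0=0.3)
print(norm, mean, uncertainty)  # expected to be close to 1, z_bar, and dz0
```

The three printed numbers should reproduce, to numerical accuracy, unity, $\bar{z}$, and $\Delta z_0$, in line with the analytical integrals.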

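Now I can find the appropriate amplitudes $A(p_z)$ in Eq. 6.10, which would reproduce the wave function given by Eq. 6.13. Substitution of this equation into Eq. 6.11 yields

$$
\begin{aligned}
A(p_z) &= \frac{1}{\sqrt{2\pi\hbar}}\frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}\int_{-\infty}^{\infty}dz\,\exp\left[-\frac{(z-\bar{z})^2}{(2\Delta z_0)^2}\right]\exp\left[-i\frac{(p_z-\bar{p}_z)z}{\hbar}\right] \\
&= \frac{2\Delta z_0}{\sqrt{2\pi\hbar}\sqrt{\Delta z_0\sqrt{2\pi}}}\int_{-\infty}^{\infty}dx\,\exp\left[-x^2 - i\frac{(p_z-\bar{p}_z)}{\hbar}\left(2x\Delta z_0 + \bar{z}\right)\right] \\
&= \frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar{p}_z}{\hbar}\bar{z}\right)\int_{-\infty}^{\infty}dx\,\exp\left[-x^2 - 2i\Delta z_0\frac{p_z-\bar{p}_z}{\hbar}x\right] \\
&= \frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar{p}_z}{\hbar}\bar{z}\right)\int_{-\infty}^{\infty}dx\,\exp\left[-x^2 - 2i\Delta z_0\frac{p_z-\bar{p}_z}{\hbar}x - \left(\frac{i(p_z-\bar{p}_z)\Delta z_0}{\hbar}\right)^2 + \left(\frac{i(p_z-\bar{p}_z)\Delta z_0}{\hbar}\right)^2\right] \\
&= \frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar{p}_z}{\hbar}\bar{z}\right)\exp\left[-\frac{(p_z-\bar{p}_z)^2(\Delta z_0)^2}{\hbar^2}\right]\int_{-\infty}^{\infty}dx\,\exp\left[-\left(x + \frac{i(p_z-\bar{p}_z)\Delta z_0}{\hbar}\right)^2\right] \\
&= \frac{\sqrt{2\Delta z_0}}{\sqrt{\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar{p}_z}{\hbar}\bar{z}\right)\exp\left[-\frac{(p_z-\bar{p}_z)^2(\Delta z_0)^2}{\hbar^2}\right].
\end{aligned}
$$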

This was a long calculation, but it is worth the effort to go through it carefully. Some of the tricks that I used in its course were the substitution of variable $x = (z-\bar{z})/(2\Delta z_0)$, presenting an expression of the form $a^2 + 2ba$ as a complete square, $a^2 + 2ba + b^2 - b^2 = (a+b)^2 - b^2$, and finally the fact that the integral $\int_{-\infty}^{\infty}dx\,\exp\left[-(x-x_0)^2\right]$ still equals $\sqrt{\pi}$ regardless of the value of $x_0$. Before continuing, I will set $\bar{z} = 0$, which amounts to the choice of the zero of the coordinate $z$, and introduce the new parameter $\Delta p = \hbar/(2\Delta z_0)$. Then the expression for $A(p_z)$ becomes

$$ A(p_z) = \frac{1}{\sqrt{\Delta p}\,(2\pi)^{1/4}}\exp\left[-\frac{(p_z-\bar{p}_z)^2}{(2\Delta p)^2}\right], \tag{6.14} $$

and it is easy to verify (do it!) that, as expected, $\int_{-\infty}^{\infty}|A(p_z)|^2dp_z = 1$, while $\langle p_z\rangle = \bar{p}_z$, and parameter $\Delta p$ determines the uncertainty of the particle's momentum. Recalling the definition of this parameter in terms of the uncertainty of the coordinate, you can see that these two parameters obey the minimum version of the Schrödinger uncertainty principle:

$$ \Delta p\,\Delta z_0 = \frac{\hbar}{2}. \tag{6.15} $$

This is the special property of the Gaussian distribution: for all other initial states, the product of the uncertainties would be larger than $\hbar/2$.
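The closed form of Eq. 6.14 can be compared with a direct numerical evaluation of the integral in Eq. 6.11. The sketch below is only an illustration under stated assumptions: units with $\hbar = 1$, $\bar{z} = 0$ as in the text, arbitrary test values for $\Delta z_0$ and $\bar{p}_z$, and a simple trapezoidal rule; all function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def psi0(z, dz0, p_bar):
    """Initial Gaussian wave packet of Eq. 6.13 with z_bar = 0."""
    norm = 1.0 / math.sqrt(dz0 * math.sqrt(2 * math.pi))
    return norm * math.exp(-z ** 2 / (2 * dz0) ** 2) * cmath.exp(1j * p_bar * z / HBAR)


def amplitude_numeric(p, dz0, p_bar, half_width=12.0, n=4001):
    """Evaluate the Fourier integral of Eq. 6.11 with the trapezoidal rule."""
    a = -half_width * dz0
    h = 2 * half_width * dz0 / (n - 1)
    total = 0j
    for k in range(n):
        z = a + k * h
        w = h * (0.5 if k in (0, n - 1) else 1.0)
        total += w * psi0(z, dz0, p_bar) * cmath.exp(-1j * p * z / HBAR)
    return total / math.sqrt(2 * math.pi * HBAR)


def amplitude_exact(p, dz0, p_bar):
    """Closed-form momentum amplitude of Eq. 6.14, with dp = hbar / (2 dz0)."""
    dp = HBAR / (2 * dz0)
    return math.exp(-(p - p_bar) ** 2 / (2 * dp) ** 2) / (math.sqrt(dp) * (2 * math.pi) ** 0.25)


for p in (0.0, 1.0, 2.5):
    print(p, amplitude_numeric(p, 0.5, 1.0), amplitude_exact(p, 0.5, 1.0))
```

With $\bar{z} = 0$ the numerical amplitude should come out real (up to rounding) and agree with Eq. 6.14 for every sampled momentum. Narrowing $\Delta z_0$ in this sketch widens the momentum distribution, which is the content of Eq. 6.15.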

Having found $A(p_z)$, I can now find the time dependence of the initial wave function by considering the superposition of the stationary states at an arbitrary time $t$:

$$ \Psi(z,t) = \frac{1}{\sqrt{2\pi\hbar\,\Delta p}\,(2\pi)^{1/4}}\int_{-\infty}^{\infty}dp_z\,\exp\left[-\frac{(p_z-\bar{p}_z)^2}{(2\Delta p)^2}\right]\exp\left[-i\frac{p_z^2}{2\hbar m_e}t + i\frac{p_z}{\hbar}z\right], $$

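which at $t = 0$ is obviously reduced to the function given in Eq. 6.13 (with $\bar{z} = 0$). I begin evaluating this integral, again, by introducing a dimensionless variable

$$ x = \frac{p_z - \bar{p}_z}{2\Delta p} $$

and transforming this integral into

$$
\begin{aligned}
\Psi(z,t) &= \frac{2\Delta p}{\sqrt{2\pi\hbar\,\Delta p}\,(2\pi)^{1/4}}\int_{-\infty}^{\infty}dx\,\exp\left(-x^2\right)\exp\left[-it\frac{1}{2\hbar m_e}\left(\bar{p}_z + 2\Delta p\,x\right)^2 + i\frac{z}{\hbar}\left(\bar{p}_z + 2\Delta p\,x\right)\right] \\
&= \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar{p}_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar{p}_z\right)\int_{-\infty}^{\infty}dx\,\exp\left[-x^2 - it\frac{2\bar{p}_z\Delta p}{\hbar m_e}x - it\frac{2(\Delta p)^2}{\hbar m_e}x^2 + 2i\frac{z\,\Delta p}{\hbar}x\right] \\
&= \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar{p}_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar{p}_z\right)\int_{-\infty}^{\infty}dx\,\exp\left[-x^2\left(1 + it\frac{2(\Delta p)^2}{\hbar m_e}\right) - 2ix\frac{\Delta p}{\hbar}\left(\frac{\bar{p}_z}{m_e}t - z\right)\right].
\end{aligned}
$$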

Before continuing, let me tidy up this expression a bit, first, by replacing $\bar{p}_z/m_e$, which corresponds to the classical velocity of the particle with momentum $\bar{p}_z$, with $v_{gr}$, and second by introducing the notation

$$ \alpha = \sqrt{1 + it\frac{2(\Delta p)^2}{\hbar m_e}} = \sqrt{1 + it\frac{\hbar}{2m_e(\Delta z_0)^2}} \tag{6.16} $$

where in the second expression I replaced $\Delta p$ with $\Delta z_0$ using Eq. 6.15. As a result, the expression for the wave function now takes a somewhat less cumbersome form:

$$ \Psi(z,t) = \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar{p}_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar{p}_z\right)\int_{-\infty}^{\infty}dx\,\exp\left[-x^2\alpha^2 - 2ix\frac{\Delta p}{\hbar}\left(v_{gr}t - z\right)\right]. $$

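Performing the integral over $x$ (using the same tricks as before: the substitution $\tilde{x} = \alpha x$ and completion of the square), I will obtain the final expression for the wave function $\Psi(z,t)$:

$$
\begin{aligned}
\Psi(z,t) &= \frac{\sqrt{2\Delta p}}{\sqrt{\hbar}\,(2\pi)^{1/4}\,\alpha}\exp\left(-it\frac{\bar{p}_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar{p}_z\right)\exp\left[-\left(\frac{\Delta p}{\alpha\hbar}\right)^2\left(v_{gr}t - z\right)^2\right] \\
&= \frac{1}{\alpha\sqrt{\Delta z_0}\,(2\pi)^{1/4}}\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\alpha\Delta z_0)^2}\right]\exp\left(-it\frac{\bar{p}_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar{p}_z\right)
\end{aligned}
\tag{6.17}
$$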

where at the last step I used Eq. 6.15 to replace $\Delta p$ with $\Delta z_0$. Not surprisingly, at $t = 0$ Eq. 6.17 is reduced to $\psi(z)$ as given by Eq. 6.13.

It is quite educational to inspect the various factors in this expression separately. The last factor is a regular plane wave with a wave number determined by the expectation value of the momentum $\bar{p}_z$ and corresponding frequency $\bar{\omega} = \bar{p}_z^2/(2\hbar m_e)$. This wave propagates with the standard phase velocity $v_{ph} = \hbar\bar{\omega}/\bar{p}_z$, but its amplitude, defined by the first exponential factor, is also time and coordinate dependent. For any given instant $t$, there is a coordinate $z_{max} = v_{gr}t$ at which the amplitude is the largest, and it decreases as $z$ deviates from $z_{max}$ in either direction. One can say that the amplitude factor modulates the initial plane wave, turning it into a wave packet more or less localized within a finite coordinate region. This localization region obviously changes its position with time, as is evident from the definition of $z_{max}$, and this motion of the localization region occurs with velocity $v_{gr} = z_{max}/t$. This velocity is called the group velocity of the wave packet because it characterizes the motion of the entire group of waves participating in the superposition forming the packet, while the phase velocity describes the motion of each separate wave component of this superposition. These two velocities are different because the phase velocity $v_{ph} = p/(2m_e)$ depends on the momentum $p$ and is, therefore, different for each member of the group. To illustrate all these points, I plotted the real part of the wave function presented in Eq. 6.17 for two distinct time instants as shown in Fig. 6.3.

Fig. 6.3 The real part of the wave function representing a wave packet as a function of the coordinate for two different instants. You can see suppression of oscillations away from the main maximum as well as the displacement of the main maximum. The time interval is chosen to be equal to the period of the oscillating factor so that the magnitude of the main maximum remains the same for both instants. For other time intervals, this does not have to be the case because the decrease of the cosine function can damp the maximum's magnitude. Also, this plot does not account for the fact that the parameter $\alpha$ in Eq. 6.17 is complex-valued and depends on time. See the discussion of the role of this parameter further in the text

It is easy to see that the expression for the group velocity $v_{gr} = \bar{p}_z/m_e$ can be obtained from the dispersion relation $E(p_z) = p_z^2/2m_e$ of the free particle as

$$ v_{gr} = \left.\frac{dE}{dp_z}\right|_{p_z = \bar{p}_z}. \tag{6.18} $$

Equation 6.18 can also be generalized to the case of three-dimensional propagation, in which case the derivative is replaced by a gradient, $\mathbf{v}_{gr} = \left.\nabla_{\mathbf{p}}E(\mathbf{p})\right|_{\mathbf{p}=\bar{\mathbf{p}}}$, and also to more exotic cases of particles whose dispersion relation is different from Eq. 6.5. If you are wondering where on earth you can find free particles with dispersion different from the standard quadratic form, here are two examples for you: (1) the relation between energy and momentum of relativistic particles is $E = \sqrt{m_e^2c^4 + p^2c^2}$, and (2) electrons in semiconductors can be in many practically important cases approximated as "free" particles with a modified dispersion relation $E(p)$. In general, the picture of a wave packet propagating with the group velocity can be generalized to non-Gaussian wave packets as long as they can be described by a momentum wave function $A(p_z)$ with a single and relatively narrow maximum.
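To make the distinction between the two velocities concrete, Eq. 6.18 can be evaluated by a finite-difference derivative for both the quadratic and the relativistic dispersion relations. This is a small sketch under my own assumptions: units with $m_e = c = 1$, a central-difference step chosen by hand, and hypothetical function names.

```python
import math

MASS = 1.0  # assumption: units with m_e = 1
C = 1.0     # assumption: units with c = 1


def energy_free(p):
    """Non-relativistic dispersion relation, Eq. 6.5."""
    return p ** 2 / (2 * MASS)


def energy_relativistic(p):
    """Relativistic dispersion relation E = sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt(MASS ** 2 * C ** 4 + p ** 2 * C ** 2)


def group_velocity(energy, p, step=1e-6):
    """Central-difference estimate of dE/dp at momentum p (Eq. 6.18)."""
    return (energy(p + step) - energy(p - step)) / (2 * step)


def phase_velocity(energy, p):
    """Phase velocity E/p of a single plane-wave component."""
    return energy(p) / p


p_bar = 0.8
print(group_velocity(energy_free, p_bar), phase_velocity(energy_free, p_bar))
print(group_velocity(energy_relativistic, p_bar), p_bar * C ** 2 / energy_relativistic(p_bar))
```

For the quadratic dispersion the group velocity comes out as $\bar{p}/m_e$, twice the phase velocity, while for the relativistic case the same derivative gives $pc^2/E$, which stays below $c$ for any momentum.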

So far, the behavior of the wave packet appears to be consistent not only with the traditional Copenhagen interpretation of quantum mechanics but also with Schrödinger's pilot wave picture. However, when discussing the role of the amplitude modulating factor in Eq. 6.17, I so far ignored an obvious "elephant in the room," which makes this discussion somewhat more nuanced. The parameter $\alpha$, which sits "quietly" in the denominator of the modulating factor (as well as in a normalization pre-factor), is complex-valued and time dependent. The first of these circumstances makes the expression

$$ B(z,t) = \frac{1}{\alpha\sqrt{\Delta z_0}\,(2\pi)^{1/4}}\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\alpha\Delta z_0)^2}\right] \tag{6.19} $$

not quite a pure amplitude because it also has a phase attached to it. This phase, however, is of little interest, and in order to focus on the actual amplitude part of this expression, I will consider its squared absolute value, $|B(z,t)|^2$, which, of course, coincides with $|\Psi(z,t)|^2$ and in the Copenhagen interpretation yields the probability distribution $P(z,t)$ for the coordinates of the particle at any given time in the state described by the wave packet $\Psi(z,t)$:

$$ P(z,t) = \frac{1}{\sqrt{2\pi}\,\Delta z_0\,|\alpha|^2}\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\Delta z_0)^2}\left(\frac{1}{\alpha^2} + \frac{1}{\left(\alpha^2\right)^*}\right)\right]. \tag{6.20} $$

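Now I define

$$ \frac{1}{\Delta z^2} = \frac{1}{2(\Delta z_0)^2}\left(\frac{1}{\alpha^2} + \frac{1}{\left(\alpha^2\right)^*}\right) $$

and using the definition of $\alpha$ from Eq. 6.16 calculate

$$ \frac{1}{\Delta z^2} = \frac{1}{2(\Delta z_0)^2}\left(\frac{1}{1 + \dfrac{i\hbar t}{2m_e(\Delta z_0)^2}} + \frac{1}{1 - \dfrac{i\hbar t}{2m_e(\Delta z_0)^2}}\right) = \frac{1}{2(\Delta z_0)^2}\,\frac{2}{1 + \dfrac{\hbar^2t^2}{4m_e^2(\Delta z_0)^4}} = \frac{(\Delta z_0)^2}{(\Delta z_0)^4 + \dfrac{\hbar^2t^2}{4m_e^2}}. \tag{6.21} $$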

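I can also find

$$ \frac{1}{|\alpha|^2} = \frac{1}{\sqrt{\left(1 + \dfrac{i\hbar t}{2m_e(\Delta z_0)^2}\right)\left(1 - \dfrac{i\hbar t}{2m_e(\Delta z_0)^2}\right)}} = \frac{1}{\sqrt{1 + \dfrac{\hbar^2t^2}{4m_e^2(\Delta z_0)^4}}} = \frac{\Delta z_0}{\Delta z}. \tag{6.22} $$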

Substitution of Eqs. 6.21 and 6.22 into Eq. 6.20 converts the expression for the probability density into the following nice-looking form:

$$ P(z,t) = \frac{1}{\sqrt{2\pi}\,\Delta z}\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{2(\Delta z)^2}\right]. \tag{6.23} $$
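The passage from the complex-valued wave function of Eq. 6.17 to the real Gaussian of Eq. 6.23 is easy to check numerically. The sketch below is an illustration under my own assumptions (units with $\hbar = m_e = 1$, arbitrary test values of $\Delta z_0$ and $\bar{p}_z$, hypothetical function names); it evaluates $|\Psi(z,t)|^2$ directly and compares it with Eq. 6.23.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: units with m_e = 1


def psi(z, t, dz0=0.5, p_bar=2.0):
    """Wave packet of Eq. 6.17 with complex, time-dependent alpha from Eq. 6.16."""
    alpha = cmath.sqrt(1 + 1j * HBAR * t / (2 * MASS * dz0 ** 2))
    v_gr = p_bar / MASS
    envelope = cmath.exp(-(v_gr * t - z) ** 2 / (2 * alpha * dz0) ** 2)
    envelope /= alpha * math.sqrt(dz0) * (2 * math.pi) ** 0.25
    plane_wave = cmath.exp(-1j * t * p_bar ** 2 / (2 * HBAR * MASS) + 1j * z * p_bar / HBAR)
    return envelope * plane_wave


def density(z, t, dz0=0.5, p_bar=2.0):
    """Probability density of Eq. 6.23 with the time-dependent width Delta z."""
    dz = dz0 * math.sqrt(1 + (HBAR * t / (2 * MASS * dz0 ** 2)) ** 2)
    v_gr = p_bar / MASS
    return math.exp(-(v_gr * t - z) ** 2 / (2 * dz ** 2)) / (math.sqrt(2 * math.pi) * dz)


for z, t in ((0.0, 0.0), (1.0, 0.5), (-0.7, 2.0), (3.0, 1.5)):
    print(z, t, abs(psi(z, t)) ** 2, density(z, t))
```

The two printed columns should agree at every sampled point, which confirms that the complex phase carried by $\alpha$ drops out of the probability density and only the real, growing width survives.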

Comparing this result with Eq. 6.13, it becomes clear that $z_{max} = v_{gr}t$ is the expectation value of the coordinate in the state described by the wave packet, while $\Delta z$ represents its uncertainty. Rewriting Eq. 6.21, I can present this uncertainty in a more illuminating form:

$$ \Delta z = \Delta z_0\sqrt{1 + \frac{\hbar^2t^2}{4m_e^2(\Delta z_0)^4}} \tag{6.24} $$

which shows that the localization range of the wave packet increases with time with the rate (roughly defined as the derivative $d(\Delta z)^2/d(t^2)$) inversely proportional to the square of the initial uncertainty $\Delta z_0$. In other words, the tighter you try to squeeze your particle into a smaller volume, the faster the localization volume of the particle increases with time. This phenomenon of wave packet spreading is what kills Schrödinger's pilot wave interpretation: the broadening of the wave packets would make such a pilot wave unstable. To get an intuitive feeling for how fast this spreading takes place, assume that an electron is initially localized in a region of atomic dimensions with $\Delta z_0 \simeq 10^{-10}$ m. Substituting the values of the reduced Planck constant $\hbar$ and the electron's mass into Eq. 6.24, you will get for $\Delta z$ (with $t$ in seconds):

$$ \Delta z = 10^{-10}\sqrt{1 + 3.35\times 10^{31}\,t^2}\ \text{m}, $$

which reaches a value of the order of $10^3$ m in just about 2 ms!

This essentially completes the discussion of the free particle wave packets, but I cannot pass up an opportunity to play with the Heisenberg picture whenever it is possible, and this is one of the simplest situations to showcase it. You can consider it as a reward to you for being such a good sport and wading with me through the tedious analysis of the Gaussian wave packet.
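The numerical estimate of the spreading rate can be reproduced in a few lines. This is a sketch under stated assumptions: the CODATA values of $\hbar$ and $m_e$ below are supplied by me, not quoted from the text, and the function names are hypothetical. Note that the coefficient of $t^2$ is sensitive to whether $\hbar$ or $h = 2\pi\hbar$ is used. Eq. 6.24 contains $\hbar$, which gives about $3.35\times 10^{31}\ \text{s}^{-2}$, whereas inserting $h$ instead would inflate it by $(2\pi)^2$ to about $1.32\times 10^{33}\ \text{s}^{-2}$.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s (CODATA value, supplied here)
M_E = 9.1093837015e-31  # electron mass in kg (CODATA value, supplied here)


def spreading_coefficient(dz0):
    """Coefficient of t^2 under the square root in Eq. 6.24, in s^-2."""
    return (HBAR / (2 * M_E * dz0 ** 2)) ** 2


def width(t, dz0):
    """Wave packet width Delta z(t) from Eq. 6.24, in meters."""
    return dz0 * math.sqrt(1 + spreading_coefficient(dz0) * t ** 2)


dz0 = 1e-10  # initial localization of atomic size, in meters
print(spreading_coefficient(dz0))  # roughly 3.35e31 s^-2
print(width(2e-3, dz0))            # width after 2 ms, in meters
```

At late times the width grows linearly, $\Delta z \approx \hbar t/(2m_e\Delta z_0)$, so an electron initially confined to atomic dimensions spreads over roughly a kilometer within a couple of milliseconds.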

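Recalling that the Hamiltonian of a free particle is just

$$ \hat{H} = \frac{\hat{\mathbf{P}}^2}{2m_e}, $$

you can easily derive the Heisenberg equations for the components of the position and momentum operators

$$ \frac{d\hat{\mathbf{r}}}{dt} = \frac{\hat{\mathbf{P}}}{m_e}, \qquad \frac{d\hat{\mathbf{P}}}{dt} = 0 $$

with the obvious solution

$$ \hat{\mathbf{r}} = \hat{\mathbf{r}}_0 + \frac{\hat{\mathbf{P}}_0}{m_e}t \tag{6.25} $$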

where $\hat{\mathbf{r}}_0$ and $\hat{\mathbf{P}}_0$ are, as usual, the Schrödinger picture operators setting initial conditions in the Heisenberg picture. Assuming that the particle is in some arbitrary state $|\chi\rangle$, which does not change with time in the Heisenberg picture, I can immediately derive for the expectation value of the position operator:

$$ \langle\hat{\mathbf{r}}\rangle = \langle\hat{\mathbf{r}}_0\rangle + \frac{\langle\hat{\mathbf{P}}_0\rangle}{m_e}t $$

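which is, of course, a three-dimensional version of the expression for the expectation value of the coordinate found from Eq. 6.23, where the initial expectation value of the coordinate, $\bar{z}$, was set to zero. Now, squaring Eq. 6.25 I get

$$ \hat{\mathbf{r}}^2 = \hat{\mathbf{r}}_0^2 + \frac{\hat{\mathbf{P}}_0^2}{m_e^2}t^2 + \frac{t}{m_e}\left(\hat{\mathbf{r}}_0\cdot\hat{\mathbf{P}}_0 + \hat{\mathbf{P}}_0\cdot\hat{\mathbf{r}}_0\right). $$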

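Using the position representation for the operators $\hat{\mathbf{r}}_0$ and $\hat{\mathbf{P}}_0$, and Eq. 6.13 to represent state $|\chi\rangle$, you can demonstrate by direct computation that

$$ \langle\chi|\,\hat{\mathbf{r}}_0\cdot\hat{\mathbf{P}}_0 + \hat{\mathbf{P}}_0\cdot\hat{\mathbf{r}}_0\,|\chi\rangle = 0 $$

so that one has for the uncertainty of the position $\Delta r^2$:

$$ \Delta r^2 = \Delta r_0^2 + \frac{\Delta p^2}{m_e^2}t^2, \tag{6.26} $$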

where $\Delta p^2$ is again the uncertainty of the momentum computed with the initial state of the particle. If this initial state is Gaussian, then, limiting Eq. 6.26 to just a single coordinate, you can use Eq. 6.15 to replace the momentum uncertainty with $\Delta z_0$, which will yield Eq. 6.24 for the spreading of the wave packet. In addition to this, however, Eq. 6.26 demonstrates that the phenomenon of wave packet spreading is not limited to one-dimensional Gaussian packets and is a general feature of free propagation of quantum particles.
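The reduction of Eq. 6.26 to Eq. 6.24 for a Gaussian initial state takes only one line. Restricting Eq. 6.26 to the $z$ coordinate and substituting $\Delta p = \hbar/(2\Delta z_0)$ from Eq. 6.15 gives the following short worked step, which uses only quantities already defined in the text:

```latex
\begin{align*}
(\Delta z)^2
  &= (\Delta z_0)^2 + \frac{(\Delta p)^2}{m_e^2}\,t^2
   = (\Delta z_0)^2 + \frac{\hbar^2 t^2}{4 m_e^2 (\Delta z_0)^2} \\
  &= (\Delta z_0)^2\left[1 + \frac{\hbar^2 t^2}{4 m_e^2 (\Delta z_0)^4}\right]
  \quad\Longrightarrow\quad
  \Delta z = \Delta z_0\sqrt{1 + \frac{\hbar^2 t^2}{4 m_e^2 (\Delta z_0)^4}} .
\end{align*}
```

This is exactly Eq. 6.24, obtained here without evaluating a single Gaussian integral.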

6.2 Rectangular Potential Wells and Barriers

6.2.1 Potential Wells: Systems with Mixed Spectrum

The first important model, which I am going to introduce in this section, is characterized by a potential profile shown in Fig. 6.4, and which can be described as

$$V(z) = \begin{cases} V_w, & |z| < d/2 \\ V_b, & |z| > d/2 \end{cases} \tag{6.27}$$

Fig. 6.4 Rectangular potential well

where I assumed for concreteness that $V_b > V_w$. Such a potential profile is called a potential well. If one chooses to count the energy from the bottom of the well, then $V_w \to 0$ and $V_b \to V_b - V_w$. The energy levels in this potential must be separated into two different regions, $V_w < E_z < V_b$ and $E_z > V_b$, with distinctly different types of behavior (states with $E_z < V_w$ do not exist). In the former case, a classical particle would have been confined between the two “walls” of this potential well at $|z| = d/2$, bouncing back and forth, while in the latter, classical motion is unbounded (the particle can be anywhere along the $Z$-axis). As was already discussed in Sect. 5.1.3, the two different types of classical behavior translate into different quantum behaviors as well.

Bound States: Discrete Spectrum. I will begin with the spectral region $V_w < E_z < V_b$, which corresponds to bound classical motion, and where you should expect to see a discrete spectrum of energy eigenvalues. The potential we are dealing with is a piecewise continuous function with finite jumps at $z = \pm d/2$. For the range of energies under consideration, the spatial regions defined by $|z| > d/2$ are classically forbidden. Therefore, as was discussed in Sect. 5.1.3, Eq. 5.37 must be complemented by the boundary conditions requiring that the wave function vanish at $z \to \pm\infty$ and by continuity conditions at $|z| = d/2$.

However, before I start dirtying my hands and digging into the boring business of actually writing down the wave functions and matching the boundary conditions and all that, I want to play with the problem a little bit more and see if I can make this task a bit less boring. The nice thing about this particular potential is that it is symmetric with respect to inversion of the coordinate $z$: $V(-z) = V(z)$, and I hope that you recognize here your old acquaintance from Sect. 5.1.2, the parity transformation, $\hat{\Pi}V(z) = V(-z)$. And not only is the potential symmetric, but the boundary conditions are symmetric as well: $\varphi(z) \to 0$ when $|z| \to \infty$. Since the kinetic energy was shown earlier to be always parity invariant (it does not change upon the parity transformation), you can confidently conclude that the entire Hamiltonian of this system is symmetric with respect to this transformation. And as I have already explained in Sect. 5.1.2, this means that the Hamiltonian commutes with the parity operator, $\hat{\Pi}$, so that the wave functions representing eigenvectors of this Hamiltonian also represent eigenvectors of $\hat{\Pi}$. Therefore, solutions of the Schrödinger equation 5.37 with the potential given by Eq. 6.27 can be classified into even ($\varphi(-z) = \varphi(z)$) and odd ($\varphi(-z) = -\varphi(z)$) functions, with the immediate consequence that you only need to deal with boundary and continuity conditions at $z > 0$. Indeed, the definite parity of the solutions, even or odd, ensures that the conditions for $z < 0$ are satisfied simultaneously with those at $z > 0$. Here is the power of symmetry for you: I just cut the number of equations to be solved to satisfy the continuity conditions by half without even breaking a sweat! Using symmetry arguments for such a simple problem might seem a bit of an overkill, since it is not too difficult to solve it by brute force. I still wanted to show it to you so that you would be better prepared to understand the implications of symmetry in more “sanity-threatening situations.”

Because of the discontinuity of the potential at $|z| = d/2$, solutions for the intervals $|z| < d/2$ and $|z| > d/2$ must be found independently and stitched together afterward using continuity conditions.

1. $|z| < d/2$. The Schrödinger equation for this interval takes the form

$$\frac{d^2\varphi}{dz^2} = -\frac{2m_e}{\hbar^2}\left(E_z - V_w\right)\varphi(z),$$

where $E_z - V_w > 0$. This equation is similar to that of the free particle with positive energy $E_z - V_w$, and its most general solution has, therefore, the form

$$\varphi(z) = Ae^{ikz} + Be^{-ikz} = \tilde{A}\sin kz + \tilde{B}\cos kz,$$

where

$$k = \frac{\sqrt{2m_e\left(E_z - V_w\right)}}{\hbar} \tag{6.28}$$

is a real quantity. The choice of the exponential or trigonometric functions to represent this solution is a matter of one’s taste and/or convenience: the expressions are equivalent with $\tilde{A} = i(A - B)$ and $\tilde{B} = A + B$. However, since we know that $\varphi(z)$ must have a definite parity, the trigonometric form is more convenient for taking advantage of this insight. Indeed, in order to generate an even solution I can simply set $\tilde{A} = 0$, while an odd solution is obtained by choosing $\tilde{B} = 0$:

$$\varphi_e(z) = B\cos kz, \quad |z| < d/2 \tag{6.29}$$

$$\varphi_o(z) = A\sin kz, \quad |z| < d/2. \tag{6.30}$$

Notation for the remaining coefficients is irrelevant, so I dropped the tildes above the letters.

2. $|z| > d/2$. In this case, the right-hand side of the corresponding Schrödinger equation

$$\frac{d^2\varphi}{dz^2} = \frac{2m_e}{\hbar^2}\left(V_b - E_z\right)\varphi(z)$$

is positive for the range of energies under consideration ($V_b - E_z > 0$), so that the general solution of this equation is given by

$$\varphi(z) = Ce^{\kappa z} + De^{-\kappa z}, \tag{6.31}$$

where now

$$\kappa = \frac{\sqrt{2m_e\left(V_b - E_z\right)}}{\hbar} \tag{6.32}$$

is a real quantity. As has already been mentioned, the solution of the Schrödinger equation in the region $z > d/2$ must vanish at $z \to +\infty$ (Eq. 5.36). The function presented by Eq. 6.31 satisfies this requirement only if the exponentially growing term is eliminated, which I achieve by simply requiring that $C = 0$. Thus, the wave function for $z > d/2$ becomes

$$\varphi(z) = De^{-\kappa z}, \quad z > d/2 \tag{6.33}$$

for both even and odd solutions. For negative values of the coordinate, $z < -d/2$, Eq. 6.33 produces for even and odd solutions, respectively:

$$\varphi_e(z) = De^{\kappa z} \tag{6.34}$$

$$\varphi_o(z) = -De^{\kappa z}. \tag{6.35}$$

Before continuing with stitching the wave functions at $z = d/2$, let me point out that the solutions in the classically allowed region $|z| < d/2$ are represented by oscillating functions. Upon crossing into the classically forbidden region $|z| > d/2$, the oscillating character of the solutions turns into a monotonic decrease. This example illustrates the generic properties discussed in Sect. 5.1.3 and can be used to formulate a general rule of thumb applicable to any piecewise constant potential: in the classically allowed regions, the wave function is represented by a combination of trigonometric functions, while in classically forbidden regions, the solution is given by a combination of exponential functions with a real argument. However, I need to warn you to pay attention to the fact that I was able to eliminate the exponentially growing terms in Eqs. 6.33–6.35 only because the classically forbidden region extended all the way to positive or negative infinity. If, as might happen in certain problems, the potential had another jump and the classically forbidden region crossed over to a classically allowed region, you would have to keep both growing and decreasing exponential functions because the conditions at infinity can only be used within a region of coordinates extending, well, to infinity.

Equation 6.33 must be stitched with either Eq. 6.29 or Eq. 6.30 to generate a continuous solution describing the wave function in the entire domain of the coordinate $z$. This must be done separately for even and odd solutions. In the former case, the continuity of the wave function and of its derivative at $z = d/2$ requires that

$$B\cos\frac{kd}{2} = De^{-\kappa d/2} \tag{6.36}$$

$$-Bk\sin\frac{kd}{2} = -\kappa De^{-\kappa d/2}. \tag{6.37}$$

For arbitrary values of $E_z$, which appears in these equations via the parameters $k$ and $\kappa$, Eqs. 6.36 and 6.37 can have only the trivial solution $B = D = 0$. Obviously, this is not what we want. However, if I insist on having non-zero solutions, I must impose a special condition on the allowed values of $k$ and $\kappa$. One way to derive this condition is to divide Eq. 6.37 by Eq. 6.36 (this is allowed because we require that $B, D \neq 0$), yielding

$$k\tan\frac{kd}{2} = \kappa. \tag{6.38}$$

Taking into account Eqs. 6.28 and 6.32, you can recognize that Eq. 6.38 is a transcendental equation for the energy $E_z$. Solutions of this equation determine the values of energy permitting the existence of non-zero coefficients $B$ and $D$ and, hence, of the wave functions satisfying all the boundary conditions. The solutions of Eq. 6.38 are obviously eigenvalues of the Hamiltonian, also called allowed energy values or energy levels.

Equation 6.38 does not submit to an analytical solution, but it can still be analyzed qualitatively to help you determine, at least, the number of solutions it might have. To this end, it is convenient to rewrite this equation introducing dimensionless variables, such as $kd/2$ and $\kappa d/2$. To facilitate the transition to these variables, I first compute $k^2 + \kappa^2$ using Eqs. 6.28 and 6.32:

$$k^2 + \kappa^2 = \frac{2m_e\left(V_b - V_w\right)}{\hbar^2}.$$

Multiplying this expression by $d^2/4$ and introducing the dimensionless variable $\varepsilon$ for $kd/2$, I have

$$\varepsilon^2 + \frac{\kappa^2 d^2}{4} = \frac{m_e\left(V_b - V_w\right)d^2}{2\hbar^2} \;\Rightarrow\; \frac{\kappa d}{2} = \sqrt{\varepsilon_0^2 - \varepsilon^2},$$

where I introduced another dimensionless parameter $\varepsilon_0$ defined as

$$\varepsilon_0 = \frac{d}{\hbar}\sqrt{\frac{m_e\left(V_b - V_w\right)}{2}}.$$

Multiplying both sides of Eq. 6.38 by $d/2$, I can rewrite it now as

$$\tan\varepsilon = \frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.39}$$

You can see that $\varepsilon_0$ incorporates all relevant parameters of the system, solely determining the allowed energy values. This is an excellent illustration of the power of dimensionless variables: four different parameters have collapsed into a single one, which rules them all. Without even solving the equation, I know now that all rectangular potential wells with different values of $m_e$, $V_b - V_w$, and $d$ will have the same dimensionless energy levels as long as all these parameters correspond to the same $\varepsilon_0$.
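As a small illustration of this collapse of parameters, the following sketch evaluates $\varepsilon_0$ in SI units for two different wells. The numerical values (an electron in wells of 1 nm and 2 nm width) and the function name `epsilon0` are hypothetical and chosen only for illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # one electron-volt in joules


def epsilon0(width, depth, mass=M_E):
    """eps0 = (d / hbar) * sqrt(m (Vb - Vw) / 2); width in m, depth in J."""
    return width / HBAR * math.sqrt(mass * depth / 2.0)


if __name__ == "__main__":
    narrow_deep = epsilon0(1.0e-9, 1.0 * EV)      # d = 1 nm, depth 1 eV
    wide_shallow = epsilon0(2.0e-9, 0.25 * EV)    # d = 2 nm, depth 0.25 eV
    print(narrow_deep, wide_shallow)
```

Doubling the width while reducing the depth by a factor of four leaves $\varepsilon_0$ unchanged, so both wells share the same set of dimensionless levels $\varepsilon_n$, even though their energies in electron-volts differ.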

Obviously, Eq. 6.39 only makes sense for $\varepsilon \le \varepsilon_0$, so it is important to understand the physical meaning of this condition. Substituting all necessary definitions, you can see that $\varepsilon = \varepsilon_0$ turns into

$$\frac{m_e\left(E_z - V_w\right)d^2}{2\hbar^2} = \frac{m_e\left(V_b - V_w\right)d^2}{2\hbar^2} \;\Rightarrow\; E_z = V_b,$$

i.e., at the point $\varepsilon = \varepsilon_0$ the energy crosses over the potential barrier, where the assumptions used to derive Eq. 6.39 lose their validity.

In order to understand solutions to Eq. 6.39, it is useful to visualize graphs of its left-hand and right-hand sides. As $\varepsilon$ increases from zero, the function on the right decreases from positive infinity to zero at $\varepsilon = \varepsilon_0$, where it terminates. The left-hand side is a tangent, which grows from zero at $\varepsilon = 0$ and reaches its asymptote at $\varepsilon = \pi/2$, where it jumps all the way to negative infinity, starts its climb toward the next zero at $\pi$, goes to infinity at $3\pi/2$, and so on. If $\varepsilon_0 < \pi$, the two functions will cross only once because the right-hand side will end before the left-hand side manages to get to the positive territory again. Once $\varepsilon_0$, however, crosses the $\pi$ threshold, the second crossing becomes possible, and one more when $\varepsilon_0$ exceeds $2\pi$, and so on. An important point here is that at least one even solution always exists no matter how small $\varepsilon_0$ becomes. Another important qualitative point one may take home is that the magnitude of $\varepsilon_0$ depends on two main parameters: the depth of the well $V_b - V_w$ and its geometric width $d$; wider and deeper wells are able to accommodate a larger number of allowed energy levels, and in order to decrease the number of energy eigenvalues belonging to the discrete spectrum, one can make the well either narrower or shallower. It is important to remember, though, that the geometric width of the well affects not only the values of allowed energies but also the difference between adjacent energy levels, which are closer to each other in wider wells.
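The graphical argument can be checked numerically. The sketch below is one possible way of doing it, not the only one: it multiplies Eq. 6.39 by $\varepsilon\cos\varepsilon$ to avoid the poles of the tangent and applies plain bisection on each branch $n\pi < \varepsilon < n\pi + \pi/2$ where the tangent is positive. The helper names `bisect` and `even_roots` are hypothetical.

```python
import math


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid      # root lies in the upper half
        else:
            hi = mid                 # root lies in the lower half
    return 0.5 * (lo + hi)


def even_roots(eps0):
    """Solutions of tan(eps) = sqrt(eps0^2 - eps^2) / eps on 0 < eps <= eps0."""

    def f(eps):
        # Eq. 6.39 multiplied by eps * cos(eps): no singularities left.
        root = math.sqrt(max(eps0**2 - eps**2, 0.0))
        return eps * math.sin(eps) - root * math.cos(eps)

    roots = []
    n = 0
    while n * math.pi < eps0:
        lo = n * math.pi                          # tangent starts at zero here
        hi = min(lo + 0.5 * math.pi, eps0)        # pole of the tangent or end of the curve
        roots.append(bisect(f, lo, hi))
        n += 1
    return roots


if __name__ == "__main__":
    for eps0 in (0.5, 3.0, 4.0, 7.0):
        print(eps0, even_roots(eps0))
```

For every $\varepsilon_0 > 0$ the list contains at least one entry, and a new even solution appears each time $\varepsilon_0$ passes a multiple of $\pi$.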

The stitching conditions for the odd wave functions take the form

$$A\sin\frac{kd}{2} = De^{-\kappa d/2} \tag{6.40}$$

$$Ak\cos\frac{kd}{2} = -\kappa De^{-\kappa d/2}, \tag{6.41}$$

resulting in a different equation for the allowed energy values:

$$\cot\varepsilon = -\frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.42}$$

The right-hand side of this equation changes from negative infinity to zero at $\varepsilon = \varepsilon_0$, while the left-hand side begins at positive infinity at $\varepsilon = 0$ and crosses to the negative territory only for $\varepsilon > \pi/2$. Thus, if $\varepsilon_0 < \pi/2$, this equation has no solutions. In this case, the only allowed eigenvalue of energy corresponds to a single even solution for the wave function. Increasing $\varepsilon_0$ beyond $\pi/2$ will produce the first odd solution, and it will happen before the second even solution appears. Following this line of reasoning, you can see that with increasing $\varepsilon_0$, eigenvalues corresponding to odd and even wave functions appear in an alternating manner.

The state corresponding to the lowest energy is called the ground state, and as you just saw, it is always represented by an even function. The next energy corresponds to an odd solution, then you have again an energy level corresponding to the even solution, then to an odd one again, and this pattern repeats until the last allowed eigenvalue is reached, which can be either odd or even.

All solutions of Eqs. 6.39 and 6.42 can be enumerated as $\varepsilon_n$, where $n = 1$ corresponds to the ground state (even) solution, $n = 2$ to the lowest-energy odd solution, and so on and so forth. A similar enumeration can be applied to the wave functions

$$\varphi_n(z) = \begin{cases} B_n\cos k_n z, & |z| < d/2 \\ B_n\cos\left(k_n d/2\right)e^{\kappa_n d/2}e^{-\kappa_n |z|}, & |z| > d/2 \end{cases} \qquad n = 1, 3, 5, \ldots \tag{6.43}$$

and

$$\varphi_n(z) = \begin{cases} A_n\sin k_n z, & |z| < d/2 \\ A_n\sin\left(k_n d/2\right)e^{\kappa_n d/2}e^{-\kappa_n z}, & z > d/2 \\ -A_n\sin\left(k_n d/2\right)e^{\kappa_n d/2}e^{\kappa_n z}, & z < -d/2 \end{cases} \qquad n = 2, 4, 6, \ldots \tag{6.44}$$

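Here, $k_n = 2\varepsilon_n/d$ and $\kappa_n = 2\sqrt{\varepsilon_0^2 - \varepsilon_n^2}/d$, while the corresponding values of energy $E_{zn}$ are

$$E_{zn} = V_w + \frac{2\hbar^2}{m_e d^2}\,\varepsilon_n^2. \tag{6.45}$$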
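The enumeration of the levels can be reproduced with a short numerical sketch, given here under stated assumptions: Eqs. 6.39 and 6.42 are multiplied by $\varepsilon\cos\varepsilon$ and $\varepsilon\sin\varepsilon$, respectively, to remove the poles, each quarter-period branch is searched by bisection, and energies are measured in units of $2\hbar^2/(m_e d^2)$. All function names are hypothetical.

```python
import math


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bound_state_roots(eps0):
    """Sorted list of (eps_n, parity) for a well characterized by eps0."""

    def root_term(eps):
        return math.sqrt(max(eps0**2 - eps**2, 0.0))

    def even(eps):   # Eq. 6.39 times eps * cos(eps)
        return eps * math.sin(eps) - root_term(eps) * math.cos(eps)

    def odd(eps):    # Eq. 6.42 times eps * sin(eps)
        return eps * math.cos(eps) + root_term(eps) * math.sin(eps)

    roots = []
    m = 0            # branch index: m*pi/2 < eps < (m+1)*pi/2
    while m * 0.5 * math.pi < eps0:
        lo = m * 0.5 * math.pi
        hi = min(lo + 0.5 * math.pi, eps0)
        f, parity = (even, "even") if m % 2 == 0 else (odd, "odd")
        roots.append((bisect(f, lo, hi), parity))
        m += 1
    return roots


def energies(eps0, v_w=0.0):
    """Eq. 6.45 in units of 2 hbar^2 / (m_e d^2): E_n = V_w + eps_n^2."""
    return [v_w + eps**2 for eps, _ in bound_state_roots(eps0)]


if __name__ == "__main__":
    for n, (eps, parity) in enumerate(bound_state_roots(5.0), start=1):
        print(n, parity, eps, eps**2)
```

Each root $\varepsilon_n$ falls in the interval $\left((n-1)\pi/2,\; n\pi/2\right)$, which is the numerical counterpart of the alternation of even and odd levels.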


Fig. 6.5 Graphic solution of the eigenvalue equation for even and odd wave functions

You may notice that the wave functions in Eqs. 6.43 and 6.44 still contain undefined coefficients $B_n$ and $A_n$, respectively, while the coefficient $D$ was eliminated using Eqs. 6.36 and 6.40. This is, however, normal because all eigenvectors and the functions representing them are always defined only up to a constant factor (I have said that already, and more than once, right?). As usual, the values of these coefficients can be fixed by the normalization condition

$$\int_{-\infty}^{\infty}\varphi_n^2(z)\,dz = 1.$$

Figure 6.5 illustrates the emergence of the even and odd solutions described above. The graph on the left refers to Eq. 6.39 for the energies of even states, and the graph on the right corresponds to Eq. 6.42 for the energies of the odd states. One can see that the crossing points on the graphs, which give the values of the dimensionless energy parameter $\varepsilon$, alternate between even and odd states: the lowest energy value comes from the graph on the left, the second lowest appears in the graph on the right, and this alternation continues throughout all ten energy values depicted in these plots. It is also instructive to plot the wave functions corresponding to a few lowest energy eigenvalues. The graphs in Fig. 6.6 present (from left to right) the ground state and the first and second excited states. In addition to clearly demonstrating the even and odd nature of the respective states, these graphs reveal an important phenomenon: a transition to a higher energy level always adds an extra zero to the corresponding wave function. This behavior is actually a manifestation of a mathematical theorem valid for any one-dimensional problem with a discrete spectrum: the number of zeroes of the wave function corresponding to the $n$-th energy level ($n = 1$ corresponds to the ground state) is always equal to $n - 1$. Since a rigorous proof of this statement is beyond our reach, I will illustrate this point by considering the limiting case of a very deep well, such that $\varepsilon_0 \gg 1$. In the limit $\varepsilon_0 \to \infty$, Eq. 6.39 has solutions $\varepsilon_n = \pi n/2$, $n = 1, 3, 5, \ldots$, while the solutions of Eq. 6.42 are $\varepsilon_n = \pi n/2$, $n = 2, 4, 6, \ldots$. The corresponding energy values from Eq. 6.45 coincide with those given in Eq. 5.89 (if one replaces $L$ with $d$) for a particle whose motion is confined to a finite region of total length $d$. Obviously, this confinement corresponds to the limit of the potential well with infinitely high barriers. The wave function in this case becomes

$$\varphi_n(z) = \begin{cases} B_n\cos\dfrac{\pi n z}{d}, & n = 1, 3, 5, \ldots \\[2mm] A_n\sin\dfrac{\pi n z}{d}, & n = 2, 4, 6, \ldots \end{cases}$$

within the well, $|z| < d/2$, and it is exactly zero outside of the well ($\kappa_n$ goes to infinity and vanquishes the exponential terms $\exp\left[\kappa_n\left(-z + d/2\right)\right]$ for all $z > d/2$ and $\exp\left[\kappa_n\left(z + d/2\right)\right]$ for all $z < -d/2$). Now, one can clearly see how the increase of $n$ by 1 transforms cos into sin, adding an extra zero to the function when $z$ changes from $-d/2$ to $d/2$.

Fig. 6.6 Wave functions corresponding to the first three lowest energy eigenvalues of a rectangular potential well

Unbound (Scattering) States: Continuous Spectrum. The range of energies satisfying the condition $E_z > V_b$ corresponds to unbound classical motion, where the entire domain $-\infty < z < \infty$ becomes classically allowed. The motion of the classical particle depends on the combination of its initial position and initial velocity: if a particle is initially at $z < -d/2$ with velocity directed to the left, or at $z > d/2$ with positive velocity, it will keep moving with the same velocity; the potential well does not affect its motion at all. If, however, the initial motion of the particle is directed toward the well, it will experience infinite acceleration (or deceleration) for an infinitesimally short time interval when passing the points $z = \pm d/2$, which will result in a finite increase and then decrease of the particle’s speed. After passing the region of the well, the particle resumes its uniform motion with the same velocity as before.

Quantum mechanical behavior of the particle is described by the solution of the Schrödinger equation, which in the classically allowed region can be presented as a combination of exponential functions with complex arguments. In this spectral region, the symmetry arguments, which I used to find the discrete energy levels and the corresponding wave functions, are no longer valid because of the inherent asymmetry in the initial conditions. You will see soon that this asymmetry, which is evident in the classical description of the unbound motion, will manifest itself in the quantum description as well. Therefore, it is no longer necessary to keep the origin of the coordinate axis at the center of the well, and the consideration becomes a bit more convenient if I move it to the left by $d/2$. In this case, the left boundary of the well corresponds to $z = 0$, and the coordinate regions, for which different wave functions must be written, are now defined as $z < 0$, $0 < z < d$, and $z > d$. The most general solution for the wave function in each of these regions can be written down as

$$\varphi(z) = \begin{cases} A_1 e^{ik_1 z} + B_1 e^{-ik_1 z}, & z < 0 \\ A_2 e^{ik_2 z} + B_2 e^{-ik_2 z}, & 0 < z < d \\ A_3 e^{ik_1 z} + B_3 e^{-ik_1 z}, & z > d \end{cases} \tag{6.46}$$

where $k_1 = \sqrt{2m_e\left(E_z - V_b\right)}/\hbar$ and $k_2 = \sqrt{2m_e\left(E_z - V_w\right)}/\hbar$. This expression for the wave function has to be complemented by four stitching conditions, two at each of the points of discontinuity. Requiring continuity of the function and its derivative at $z = 0$ and $z = d$, I get

$$A_1 + B_1 = A_2 + B_2 \tag{6.47}$$

$$k_1\left(A_1 - B_1\right) = k_2\left(A_2 - B_2\right) \tag{6.48}$$

$$A_2 e^{ik_2 d} + B_2 e^{-ik_2 d} = A_3 e^{ik_1 d} + B_3 e^{-ik_1 d} \tag{6.49}$$

$$k_2\left(A_2 e^{ik_2 d} - B_2 e^{-ik_2 d}\right) = k_1\left(A_3 e^{ik_1 d} - B_3 e^{-ik_1 d}\right) \tag{6.50}$$

where Eqs. 6.47 and 6.49 ensure continuity of the wave function at $z = 0$ and $z = d$, respectively, while Eqs. 6.48 and 6.50 do the same for its derivative. Simple counting of the number of unknown coefficients and comparing it with the number of equations tells me that I have got a problem here: there are only four equations for six unknowns. Even after one coefficient is set aside as an overall factor, which the Schrödinger equation never fixes, this leaves one unknown too many. However, I have not yet specified the desired behavior of the wave function at infinity (a boundary condition), which can be useful in eliminating extra unknowns. Unlike the case of the discrete spectrum, where the behavior of the wave functions at infinity is uniquely prescribed, here I have an array of choices reflecting the different physical situations for which the problem at hand is being used.

Before digging into the issue of the boundary conditions at infinity for this problem, it might be useful to get a better physical understanding of the terms appearing in the expressions for $\varphi(z)$. To this end, let me dust off some results from Sect. 5.1.3, namely, the concept of the probability current, Eq. 5.42, which in its one-dimensional reincarnation takes the form of

$$j = \frac{i\hbar}{2m_e}\left(\varphi\frac{d\varphi^*}{dz} - \varphi^*\frac{d\varphi}{dz}\right). \tag{6.51}$$

Substituting a generic form of the wave function, $A_i e^{ik_i z} + B_i e^{-ik_i z}$, into Eq. 6.51, you find

$$\begin{aligned} j &= \frac{i\hbar}{2m_e}\Big[-ik_i\left(A_i e^{ik_i z} + B_i e^{-ik_i z}\right)\left(A_i^* e^{-ik_i z} - B_i^* e^{ik_i z}\right) \\ &\qquad\quad - ik_i\left(A_i^* e^{-ik_i z} + B_i^* e^{ik_i z}\right)\left(A_i e^{ik_i z} - B_i e^{-ik_i z}\right)\Big] \\ &= \frac{\hbar k_i}{2m_e}\Big[|A_i|^2 - |B_i|^2 + B_i A_i^* e^{-2ik_i z} - B_i^* A_i e^{2ik_i z} \\ &\qquad\quad + |A_i|^2 - |B_i|^2 - B_i A_i^* e^{-2ik_i z} + B_i^* A_i e^{2ik_i z}\Big] \\ &= \frac{\hbar k_i}{m_e}|A_i|^2 - \frac{\hbar k_i}{m_e}|B_i|^2. \end{aligned}$$

The first term in this expression describes a positive (directed in the positive $z$ direction) probability current associated with the term $A_i e^{ik_i z}$ in the wave function, while the second, negative, term describes a probability current in the opposite, negative $z$ direction and is associated with the term $B_i e^{-ik_i z}$ in the wave function. Now, imagine a classical beam of particles of mass $m_e$ all moving with the same speed $v$ in the positive direction of the $z$-axis. The current of particles in this beam (the number of particles crossing a plane perpendicular to the flow per unit time per unit area of the cross section of the beam) is easily found to be $Nv$, where $N$ is the number of particles per unit volume of the beam. This expression coincides with the quantum mechanical probability current if you replace $v$ with $p/m_e$ and $p$ with $\hbar k$, and identify $|A|^2$ with $N$. This comparison allows interpreting the terms of the wave function containing $e^{ikz}$ as corresponding to a beam of particles propagating from left to right and the terms with $e^{-ikz}$ as describing particles propagating in the opposite direction.
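The cancellation of the interference terms in the current is easy to confirm numerically. The sketch below evaluates Eq. 6.51 with a central finite difference and compares it with $\hbar k\left(|A|^2 - |B|^2\right)/m_e$ at several points; it assumes units with $\hbar = m_e = 1$, and the amplitudes and function names are arbitrary illustrative choices.

```python
import cmath


def wave(z, a, b, k):
    """Superposition a*exp(ikz) + b*exp(-ikz)."""
    return a * cmath.exp(1j * k * z) + b * cmath.exp(-1j * k * z)


def current_numeric(z, a, b, k, hbar=1.0, mass=1.0, h=1e-5):
    """Eq. 6.51 with the derivative replaced by a central difference."""
    phi = wave(z, a, b, k)
    dphi = (wave(z + h, a, b, k) - wave(z - h, a, b, k)) / (2.0 * h)
    j = 1j * hbar / (2.0 * mass) * (phi * dphi.conjugate() - phi.conjugate() * dphi)
    return j.real   # the imaginary part vanishes identically


def current_formula(a, b, k, hbar=1.0, mass=1.0):
    """Closed form: hbar*k/m * (|A|^2 - |B|^2), independent of z."""
    return hbar * k / mass * (abs(a) ** 2 - abs(b) ** 2)


if __name__ == "__main__":
    a, b, k = 1.0 + 0.5j, 0.3 - 0.2j, 2.0
    for z in (-1.3, 0.0, 2.7):
        print(z, current_numeric(z, a, b, k), current_formula(a, b, k))
```

The numerical current is the same at every $z$, even though the probability density itself oscillates because of the interference between the two counter-propagating terms.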

A typical experiment involving particles with energies in the continuous segment of the spectrum consists of sending particles created by some source, positioned far away from the potential well, toward the well and counting the number of particles in the beam behind the well (transmitted particles) or the number of particles in front of the well but propagating in the negative $z$ direction (reflected particles). In this case, the asymptotic behavior of the wave function at negative infinity must contain both left- and right-propagating currents, while the wave function at positive infinity only contains the right-propagating particles. This gives us one of the possible boundary conditions at infinity corresponding to this particular experimental situation: $\varphi(z \to \infty) = A_3 e^{ik_1 z}$. For the wave function to have this form, the coefficient $B_3$ in Eq. 6.46 must be set to zero. As a result, I end up with five unknown coefficients and the same four equations, and what is left to realize is that the term $A_1 e^{ik_1 z}$ describes the current of particles created by the source, which is external to the Schrödinger equation and is determined by an experimentalist controlling the concentration of particles in the beam leaving the source. Thus, $A_1$ shall be treated as a free parameter, while all remaining coefficients must be expressed in terms of it.

Quantities actually measured in the experiment, i.e., the fraction of particles reflected by the potential or the fraction of particles transmitted past the potential, can be interpreted quantum mechanically as the probabilities of reflection, $R = j_r/j_{inc}$, and transmission, $T = j_{tr}/j_{inc}$, where I introduced notations for the reflected current $j_r = \hbar k_1|B_1|^2/m_e$, the incident current $j_{inc} = \hbar k_1|A_1|^2/m_e$, and the transmitted current $j_{tr} = \hbar k_3|A_3|^2/m_e$. The wave number $k_3 = \sqrt{2m_e\left(E_z - V_\infty\right)}/\hbar$ is determined by the value of the potential at $z \to \infty$, $V_\infty$. In the particular case I am dealing with now, the potentials at $z < 0$ and $z > d$ are the same, so that $V_\infty = V_b$ and $k_3 = k_1$. One should realize, however, that this is not always the case, so one has to be careful when defining the transmission probability. The most general expressions for the reflection and transmission probabilities are

$$R = \frac{|B_1|^2}{|A_1|^2} \tag{6.52}$$

$$T = \frac{k_3|A_3|^2}{k_1|A_1|^2}. \tag{6.53}$$

Now it becomes clear that in order to obtain an experimentally relevant form for the scattering wave function, you need to solve the following system of equations, expressing all unknown coefficients in terms of the amplitude of the incident particles $A_1$:

$$1 + r = A_2 + B_2, \tag{6.54}$$

$$k_1\left(1 - r\right) = k_2\left(A_2 - B_2\right), \tag{6.55}$$

$$A_2 e^{ik_2 d} + B_2 e^{-ik_2 d} = te^{ik_1 d}, \tag{6.56}$$

$$k_2\left(A_2 e^{ik_2 d} - B_2 e^{-ik_2 d}\right) = k_1 te^{ik_1 d}. \tag{6.57}$$

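Here, I introduced the amplitude reflection and transmission coefficients $r = B_1/A_1$ and $t = A_3/A_1$, respectively, and redefined the amplitudes $A_2$ and $B_2$ as $A_2/A_1 \to A_2$ and $B_2/A_1 \to B_2$. Combining the first two equations, I obtain

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right)r = 2A_2,$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right)r = 2B_2,$$

while the other two yield

$$\left(1 + \frac{k_1}{k_2}\right)te^{ik_1 d} = 2A_2 e^{ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right)te^{ik_1 d} = 2B_2 e^{-ik_2 d}.$$

Expressing $A_2$ and $B_2$ from the last pair of equations and substituting them in the first pair, I get

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right)r = t\left(1 + \frac{k_1}{k_2}\right)e^{ik_1 d}e^{-ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right)r = t\left(1 - \frac{k_1}{k_2}\right)e^{ik_1 d}e^{ik_2 d},$$

which after some brushing up yields

$$\left(1 + r\,\frac{k_2 - k_1}{k_2 + k_1}\right)e^{-ik_1 d}e^{ik_2 d} = t, \qquad \left(1 + r\,\frac{k_2 + k_1}{k_2 - k_1}\right)e^{-ik_1 d}e^{-ik_2 d} = t.$$

Equating the left-hand sides of these two equations gives

$$e^{-ik_1 d}e^{ik_2 d}\left(1 + \frac{k_2 - k_1}{k_2 + k_1}\,r\right) = e^{-ik_1 d}e^{-ik_2 d}\left(1 + \frac{k_2 + k_1}{k_2 - k_1}\,r\right) \;\Rightarrow$$

$$r\left(\frac{k_2 + k_1}{k_2 - k_1}\,e^{-2ik_2 d} - \frac{k_2 - k_1}{k_2 + k_1}\right) = 1 - e^{-2ik_2 d}.$$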

Finally, some simple algebraic manipulations, which I hope you can reproduce yourselves, yield

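$$r = \frac{\left(k_1^2 - k_2^2\right)\sin k_2 d}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1\cos k_2 d}, \tag{6.58}$$

$$t = \frac{2ik_2 k_1\,e^{-ik_1 d}}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1\cos k_2 d}. \tag{6.59}$$

Now, you can easily find the two remaining coefficients $A_2$ and $B_2$:

$$A_2 = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right)e^{-ik_2 d}\,\frac{2ik_2 k_1}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1\cos k_2 d}, \tag{6.60}$$

$$B_2 = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right)e^{ik_2 d}\,\frac{2ik_2 k_1}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1\cos k_2 d}. \tag{6.61}$$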
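If you would rather not reproduce the algebra by hand, the system of Eqs. 6.54 to 6.57 can be solved numerically and compared with the closed-form reflection amplitude. The sketch below does this with a small hand-written Gaussian elimination so that only the standard library is needed; the function names, the ordering of unknowns $(r, A_2, B_2, t)$, and the sample values of $k_1$, $k_2$, and $d$ are hypothetical.

```python
import cmath
import math


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in reversed(range(n)):                     # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def scattering_amplitudes(k1, k2, d):
    """Solve Eqs. 6.54-6.57 for (r, A2, B2, t), incident amplitude set to 1."""
    e2p, e2m = cmath.exp(1j * k2 * d), cmath.exp(-1j * k2 * d)
    e1p = cmath.exp(1j * k1 * d)
    a = [
        [1, -1, -1, 0],                        # 1 + r = A2 + B2
        [-k1, -k2, k2, 0],                     # k1 (1 - r) = k2 (A2 - B2)
        [0, e2p, e2m, -e1p],                   # continuity of the function at z = d
        [0, k2 * e2p, -k2 * e2m, -k1 * e1p],   # continuity of the derivative at z = d
    ]
    b = [-1, -k1, 0, 0]
    return solve_linear(a, b)


def r_closed_form(k1, k2, d):
    """Reflection amplitude, Eq. 6.58."""
    s, c = math.sin(k2 * d), math.cos(k2 * d)
    return (k1**2 - k2**2) * s / ((k2**2 + k1**2) * s + 2j * k1 * k2 * c)


if __name__ == "__main__":
    k1, k2, d = 1.0, 1.7, 2.3
    r, a2, b2, t = scattering_amplitudes(k1, k2, d)
    print("r numeric :", r)
    print("r Eq. 6.58:", r_closed_form(k1, k2, d))
    print("|r|^2 + |t|^2 =", abs(r) ** 2 + abs(t) ** 2)
```

Because the wave numbers on both sides of the well coincide, the sum $|r|^2 + |t|^2$ printed by the script should be equal to one within rounding errors.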

Now that the expressions for the coefficients of the wave function are found in terms of $A_1$, you might wonder if it is possible and/or necessary to fix the value of the latter. Generally speaking, this is again a question of normalization of the wave function, and according to our general understanding, we must be able to normalize this function using the delta-function. However, when the wave function is not just a plane wave, the procedure becomes rather cumbersome and requires careful evaluation of diverging integrals. From a practical point of view, it does not make much sense to jump through all these hoops to achieve the normalization, which would matter only if you planned to use the resulting functions as a basis, and this almost never happens. Thus, if you are only concerned with obtaining experimentally relevant quantities, you will be happy to leave $A_1$ undetermined and use Eqs. 6.58 and 6.59 to find the reflection and transmission probabilities from Eqs. 6.52 and 6.53:

$$R = \frac{\left(k_1^2 - k_2^2\right)^2\sin^2 k_2 d}{\left(k_2^2 + k_1^2\right)^2\sin^2 k_2 d + 4k_2^2 k_1^2\cos^2 k_2 d} \tag{6.62}$$

$$T = \frac{4k_2^2 k_1^2}{\left(k_2^2 + k_1^2\right)^2\sin^2 k_2 d + 4k_2^2 k_1^2\cos^2 k_2 d}. \tag{6.63}$$

The denominator of these expressions can be rewritten in the following form:

$$\left(k_2^2 + k_1^2\right)^2\sin^2 k_2 d + 4k_2^2 k_1^2\cos^2 k_2 d = 4k_2^2 k_1^2 + \left(k_1^2 - k_2^2\right)^2\sin^2 k_2 d.$$

Thanks to this rearrangement, you can realize two important facts. First, you can immediately see that

$$R + T = 1 \tag{6.64}$$

and, second, that the transmission, considered as a function of energy, oscillates between its maximum value equal to unity, achieved at $k_2 d = \pi n$, $n = 1, 2, 3, \ldots$, and its minimum value

$$T_{min} = \frac{4k_2^2 k_1^2}{\left(k_2^2 + k_1^2\right)^2},$$

which occurs at $k_2 d = \pi/2 + \pi n$. For large values of energy, $E_z \gg V_b$, when $k_1$ and $k_2$ become close to each other, the minimum value of transmission differs little from unity, so that the transmission remains close to one for almost all energies. The reflection probability, in this case, becomes correspondingly small for all energies as well. This behavior is close to what you would expect from a classical particle, so that the high-energy limit means a transition to the classical regime. This behavior is illustrated in Fig. 6.7.
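The oscillations of the transmission are easy to tabulate. The following sketch evaluates Eq. 6.63 as a function of energy, assuming units with $\hbar = m_e = 1$ and $V_w = 0$; the well parameters ($V_b = 1$, $d = 3$) and the function names are hypothetical.

```python
import math


def wave_numbers(energy, v_b, v_w=0.0):
    """k1 (outside the well) and k2 (inside), units hbar = m_e = 1."""
    return math.sqrt(2.0 * (energy - v_b)), math.sqrt(2.0 * (energy - v_w))


def transmission(energy, v_b, d, v_w=0.0):
    """Eq. 6.63 with the denominator written as 4 k1^2 k2^2 + (k1^2 - k2^2)^2 sin^2(k2 d)."""
    k1, k2 = wave_numbers(energy, v_b, v_w)
    num = 4.0 * k1**2 * k2**2
    return num / (num + (k1**2 - k2**2) ** 2 * math.sin(k2 * d) ** 2)


def resonance_energy(n, d, v_w=0.0):
    """Energy at which k2 d = pi n; only meaningful if it exceeds V_b."""
    return v_w + 0.5 * (math.pi * n / d) ** 2


if __name__ == "__main__":
    v_b, d = 1.0, 3.0
    for energy in (1.05, 1.2337, 2.0, resonance_energy(2, d), 5.0, 50.0):
        t = transmission(energy, v_b, d)
        print(f"E = {energy:8.4f}   T = {t:.6f}   R = {1.0 - t:.6f}")
```

With these parameters, the $n = 1$ resonance would lie below $V_b$, that is, outside the continuous spectrum, so the first energy of perfect transmission corresponds to $n = 2$.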

Equation 6.64 is an important expression of the conservation of probability: it simply states that since transmission and reflection are the only two mutually exclusive events that can occur when a particle is incident on the potential, the sum of their probabilities must be equal to unity. Even though this relation was derived here for the particular case of the rectangular well, it is valid for a generic potential asymptotically approaching a constant value at $z \to \pm\infty$. The validity of Eq. 6.64 serves in reality as a test of the correctness of Eqs. 6.62 and 6.63. Using the definitions of the transmission and reflection coefficients in terms of the probability currents, I can rewrite Eq. 6.64 as

$$\frac{j_r}{j_{inc}} + \frac{j_{tr}}{j_{inc}} = 1 \;\Leftrightarrow\; j_{inc} - j_r = j_{tr}. \tag{6.65}$$

This equation establishes that the total probability current on the left of the potential well is equal to the probability current on its right, which is just a general statement of the conservation of probability, which can also be interpreted as a continuity of the probability current across any finite discontinuity of the potential.

Fig. 6.7 Transmission probability for the rectangular potential well


Fig. 6.8 Spatial dependence of $|\varphi(z)|^2$ for three different values of energy: low energy, high energy, and resonance energy where transmission goes to one and reflection to zero. The absence of reflection in the last plot is evidenced by the absence of oscillations of the probability density due to interference of the incident and reflected waves. The vertical lines delineate the edges of the well

The behavior of the wave function also changes with energy. Figure 6.8 illustrates this point by plotting the spatial dependence of the respective probability density $|\varphi(z)|^2$ at three different energies, including the one that corresponds to zero reflection. In the latter case, the probability distribution becomes flat at both $z < -d/2$ and $z > d/2$, signaling the absence of interference between incident and reflected waves. One can also notice the decrease in the period of the oscillations for higher energies, as should be expected because higher energy means a larger wave number and a shorter wavelength.

6.2.2 Square Potential Barrier

Square potential barrier is a potential well turned upside down, when the higher value of the potential energy Vb is limited to the finite interval jzj < d=2, while the lower energy Vw corresponds to the semi-infinite regions jzj > d=2 outside of this

6.2 Rectangular Potential Wells and Barriers 193

interval. The first principal difference between this situation and the one considered in the previous section is that there are no energies corresponding to a classically bound motion in this potential, and, therefore, there are no states corresponding to discrete energy levels. In both cases, Ez < Vb and Ez > Vb, classical motion is unbound, and quantum mechanical states belong to the continuous spectrum (there are no states with Ez < Vw). The difference between these energy regions is that in the former case, the interval jzj < d=2 is classically forbidden, while in the latter, the entire domain of z-coordinate is classically allowed. Respectively, there are two different types of wave functions: when Vw < Ez < Vb

'.z/ D

8̂ <̂ ˆ̂:

A1eik2z C B1e�ik2z z < �d=2 A2e 1z C B2e� 1z �d=2 < z < d=2 A3eik2z z > d=2

(6.66)

where k2 is defined as in the previous section, while 1 D p 2m .Vw � Ez/ is related

to k1 as k1 D �i 1. I already mentioned it once, but I would like to emphasize again—you cannot eliminate either of the real exponential functions in the second line of Eq. 6.66 because the requirement for the wave function to decay at infinity can be used only when the classically forbidden region expands to infinity. In the case at hand, it is limited to the region jzj < d=2, so the exponential growth of the wave function does not have enough “room” to become a problem. For energies Ez > Vb, the wave function has the form of

$$\varphi(z) = \begin{cases} A_1 e^{ik_2 z} + B_1 e^{-ik_2 z}, & z < -d/2 \\ A_2 e^{ik_1 z} + B_2 e^{-ik_1 z}, & -d/2 < z < d/2 \\ A_3 e^{ik_2 z}, & z > d/2. \end{cases} \tag{6.67}$$

This wave function is essentially equivalent to the one considered in the case of the potential well, so you can simply copy Eqs. 6.58 and 6.59 while exchanging k1 and k2:

$$B_1 = \frac{e^{-ik_2 d}\left(k_2^2 - k_1^2\right)\sin k_1 d}{\left(k_2^2 + k_1^2\right)\sin k_1 d + 2ik_2k_1\cos k_1 d} \tag{6.68}$$

$$A_3 = \frac{2ik_2k_1 e^{-ik_2 d}}{\left(k_2^2 + k_1^2\right)\sin k_1 d + 2ik_2k_1\cos k_1 d}. \tag{6.69}$$

Transmission and reflection coefficients in this case have all the same properties as in the case of the potential well, which I am not going to repeat again.
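The resonance structure of the over-the-barrier case can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates Eqs. 6.68 and 6.69 in units $\hbar = m_e = 1$ with $V_w = 0$, and it assumes that $k_2$ is the wave number outside the barrier and $k_1$ the wave number inside it, as the form of Eq. 6.67 suggests. The function name and the numerical values are illustrative only.

```python
import cmath
import math


def over_barrier_amplitudes(k1, k2, d):
    """Return (B1/A1, A3/A1) from Eqs. 6.68 and 6.69, valid for E_z > V_b."""
    sin_term = math.sin(k1 * d)
    cos_term = math.cos(k1 * d)
    denom = (k2**2 + k1**2) * sin_term + 2j * k2 * k1 * cos_term
    phase = cmath.exp(-1j * k2 * d)
    r = phase * (k2**2 - k1**2) * sin_term / denom
    t = 2j * k2 * k1 * phase / denom
    return r, t


# Assumed illustrative parameters: hbar = m_e = 1, V_w = 0, V_b = 1, d = 2.
V_b, d = 1.0, 2.0
resonant_E = V_b + (math.pi / d) ** 2 / 2   # chosen so that k1 * d = pi
for E in (1.5, 2.0, resonant_E):
    k2 = math.sqrt(2 * E)           # wave number outside the barrier
    k1 = math.sqrt(2 * (E - V_b))   # wave number above the barrier
    r, t = over_barrier_amplitudes(k1, k2, d)
    R, T = abs(r) ** 2, abs(t) ** 2
    print(f"E = {E:.4f}  R = {R:.6f}  T = {T:.6f}  R + T = {R + T:.6f}")
```

For the last energy $k_1 d = \pi$, so the sine terms vanish up to rounding error and the transmission equals one, which is the zero-reflection condition; for every energy the two probabilities add up to one.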

Going back to the case $V_w < E_z < V_b$, it might appear that here I would have to carry out all the calculations from scratch because now I have to deal with real exponential functions. But fear not, you can still use the previous result by replacing $k_1$ with $k_1 = -i\kappa_1$. The negative sign in this expression is important: it ensures that the coefficient $A_2$ in Eq. 6.67 goes over to the same coefficient $A_2$ in Eq. 6.66 (the same obviously applies to coefficients $B_2$). In order to finish the transformation of Eqs. 6.68 and 6.69 for the under-the-barrier case, you just need to recall that $\sin(ix) = i\sinh x$ and $\cos(ix) = \cosh x$. With these relations in mind, you easily obtain

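$$B_1 = \frac{-ie^{-ik_2 d}\left(k_2^2 + \kappa_1^2\right)\sinh\kappa_1 d}{-i\left(k_2^2 - \kappa_1^2\right)\sinh\kappa_1 d + 2k_2\kappa_1\cosh\kappa_1 d} \tag{6.70}$$

$$A_3 = \frac{2k_2\kappa_1 e^{-ik_2 d}}{-i\left(k_2^2 - \kappa_1^2\right)\sinh\kappa_1 d + 2k_2\kappa_1\cosh\kappa_1 d}. \tag{6.71}$$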

The respective transmission and reflection coefficients become

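$$R = \frac{\left(k_2^2 + \kappa_1^2\right)^2\sinh^2\kappa_1 d}{\left(k_2^2 - \kappa_1^2\right)^2\sinh^2\kappa_1 d + 4k_2^2\kappa_1^2\cosh^2\kappa_1 d} \tag{6.72}$$

$$T = \frac{4k_2^2\kappa_1^2}{\left(k_2^2 - \kappa_1^2\right)^2\sinh^2\kappa_1 d + 4k_2^2\kappa_1^2\cosh^2\kappa_1 d}. \tag{6.73}$$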

Even though I derived Eqs. 6.70 and 6.71 by merely extending Eqs. 6.68 and 6.69 to the region of imaginary $k_1$ (for the mathematically sophisticated: this procedure is a simple example of what is known in mathematics as analytic continuation), the properties of the reflection and transmission coefficients given by Eqs. 6.72 and 6.73 are very different from those derived for the over-the-barrier transmission case $E_z > V_b$. Gone are their periodic dependence on the energy and $d$, as well as the special values of energy at which the transmission turns to unity and the reflection goes to zero. What do we have instead? Actually quite a boring picture: transmission decreases exponentially with increasing width of the barrier $d$ and grows slowly as the energy rises from $V_w$ toward $V_b$, because $\kappa_1 = \sqrt{2m_e(V_b - E_z)}/\hbar$ vanishes at $E_z = V_b$. To illustrate the exponential dependence of the transmission on $d$, I will consider the case of a “thick” barrier, which in mathematical language means $\kappa_1 d \gg 1$. To find the required approximate expressions for $T$ and $R$, I need to remind you of a simple property of the hyperbolic functions $\cosh x$ and $\sinh x$: for large values of their argument $x$, these functions can be approximated by a simple exponential, $\sinh x \simeq \cosh x \simeq \frac{1}{2}\exp x$. Taking this into account, I can derive

$$R \simeq 1 \tag{6.74}$$

$$T \simeq \frac{16k_2^2\kappa_1^2}{\left(k_2^2 + \kappa_1^2\right)^2}e^{-2\kappa_1 d}. \tag{6.75}$$

When deriving the expression for the reflection coefficient, I “lost” the exponentially small term, which is supposed to be subtracted from unity to ensure conservation of probability. At the same time, this term makes the main contribution to the transmission coefficient and, therefore, survives. A better approximation for the reflection coefficient can be found simply by writing it down as $R = 1 - T$. Obviously, the same result can be derived directly from Eq. 6.72 by being a bit more careful and keeping the leading exponentially small terms.

What is surprising here is, of course, not the fact that the transmission is small, but that it is not exactly equal to zero. It means that there exists a non-zero probability for the particle to travel across a classically forbidden region, emerge on the other side, and keep moving as a free particle. This phenomenon, which is a quantum mechanical version of “walking through the wall,” is called tunneling, and you can hear physicists saying that the particle tunnels through the barrier. The exponential nature of the dependence upon $d$ is very important, because the exponential function is one of the fastest changing functions appearing in the mathematical description of natural processes. It means that a small change in $d$ results in a substantial change in transmission. This effect has vast practical importance and is used in many applications such as tunnel diodes, tunneling microscopy, flash memory, etc.
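The sensitivity of tunneling to the barrier width can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: it evaluates the exact expressions of Eqs. 6.72 and 6.73 in units $\hbar = m_e = 1$ with $V_w = 0$, and the function name and parameter values are hypothetical choices made only for illustration.

```python
import math


def tunneling_coefficients(E, V_b, d):
    """Return (R, T) from Eqs. 6.72 and 6.73 for 0 = V_w < E < V_b."""
    k2 = math.sqrt(2.0 * E)               # wave number outside the barrier
    kappa1 = math.sqrt(2.0 * (V_b - E))   # decay constant under the barrier
    sinh2 = math.sinh(kappa1 * d) ** 2
    cosh2 = math.cosh(kappa1 * d) ** 2
    denom = (k2**2 - kappa1**2) ** 2 * sinh2 + 4 * k2**2 * kappa1**2 * cosh2
    R = (k2**2 + kappa1**2) ** 2 * sinh2 / denom
    T = 4 * k2**2 * kappa1**2 / denom
    return R, T


# Assumed illustrative parameters: E = 0.5, V_b = 1, three barrier widths.
for d in (3.0, 3.3, 6.0):
    R, T = tunneling_coefficients(0.5, 1.0, d)
    print(f"d = {d:.1f}  T = {T:.3e}  R + T = {R + T:.12f}")
```

Comparing the first two widths shows how strongly a 10% change in $d$ affects the transmission, while $R + T$ stays equal to one in every case.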

6.3 Delta-Functional Potential

In this section, I will present a rather peculiar model potential, which does not really have direct analogies in the real world. I can justify spending some time on it by making three simple points: (a) it is easily solvable, so considering it would not take too much of our time, (b) it is useful as an illustration of a situation when the derivative of the wave function loses its continuity property, and (c) in the case of shallow potential wells, which are able to hold only a single bound state, it can provide a decent qualitative understanding of real physical situations. This utterly unrealistic potential has the form of a delta-function

$$V = -\varsigma\delta(z), \tag{6.76}$$

where the negative sign signifies that the potential is attractive, so that states with negative energies are possible and must belong to the discrete spectrum. Indeed, for negative energies the entire region of $z$ except for the single point $z = 0$ is classically forbidden, so the motion of a classical particle, if one can think of being localized at a single point as motion, is finite. Parameter $\varsigma$ in this expression represents the “strength” of the potential, but one needs to understand that the dimension of this parameter is energy $\times$ length, so it should not be interpreted as a “magnitude” of the potential. This becomes obvious if one integrates Eq. 6.76: $\varsigma = -\int V(z)\,dz$, so $\varsigma$ is the area enclosed by the potential. If one thinks of the delta-function as a limiting case of a rectangular potential well of depth $V_w$ and width $d$, with $V_w \to \infty$ and $d \to 0$ in such a way that $\varsigma = V_w d$ remains constant, the meaning of this parameter becomes even more transparent.


The main peculiarity of this model is that the discontinuity of the potential in this case involves more than just a finite jump, so that my previous arguments concerning the continuity of the derivative of the wave function are no longer applicable. Actually, this derivative is not continuous at all, and the first matter of business is to figure out how to “stitch” derivatives of the wave function defined at $z < 0$ with those defined at $z > 0$. To solve this puzzle, let me start with the basics, the Schrödinger equation

$$-\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} - \varsigma\delta(z)\varphi(z) = E\varphi(z). \tag{6.77}$$

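Integrating this equation over an infinitesimally small interval $[-\epsilon, \epsilon]$ and taking into account that an integral of a continuous function over such an interval is zero (in the limit $\epsilon \to 0$), I get

$$-\frac{\hbar^2}{2m_e}\left[\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon}\right] - \varsigma\varphi(0) = 0.$$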

This yields the derivative stitching rule:

$$\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon} = -\frac{2m_e}{\hbar^2}\varsigma\varphi(0). \tag{6.78}$$

Now, all that I need is to solve the Schrödinger equation with zero potential and negative energy separately for $z < 0$ and $z > 0$ and stitch the solutions together. Since both these regions are classically forbidden for a particle with $E < 0$, the solutions have the form of real-valued exponential functions:

$$\varphi(z) = \begin{cases} A_1 e^{\kappa z}, & z < 0 \\ A_2 e^{-\kappa z}, & z > 0 \end{cases}$$

where $\kappa = \sqrt{-2m_eE}/\hbar$, and I discarded the contributions that would grow exponentially at positive and negative infinity in order to satisfy the boundary conditions. Continuity of the wave function at $z = 0$ requires that $A_2 = A_1 \equiv A$, and Eq. 6.78 yields

$$-2\kappa A = -\frac{2m_e}{\hbar^2}\varsigma A.$$

Assuming that $A$ is non-zero (naturally) and taking into account the definition of $\kappa$, I find that this expression reduces to the equation for the allowed energy levels:

$$E = -\frac{m_e\varsigma^2}{2\hbar^2}. \tag{6.79}$$


Obviously, Eq. 6.79 shows that there is only one such energy, which is why this model can only be useful for description of shallow potential wells with a single discrete energy level.
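The bound-state result can be verified with a short numerical check. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it works in units $\hbar = m_e = 1$, uses the hypothetical name `strength` for the strength parameter of the delta potential, and tests the derivative stitching rule of Eq. 6.78 on the function $\exp(-\kappa|z|)$.

```python
import math


def delta_bound_state(m_e, strength, hbar=1.0):
    """Return (kappa, E) for the single bound state of V = -strength * delta(z)."""
    kappa = m_e * strength / hbar**2               # from the stitching condition
    energy = -m_e * strength**2 / (2 * hbar**2)    # Eq. 6.79
    return kappa, energy


def derivative_jump(kappa, eps=1e-9):
    """phi'(+eps) - phi'(-eps) for phi(z) = exp(-kappa * |z|), i.e. A = 1."""
    right = -kappa * math.exp(-kappa * eps)   # slope just to the right of z = 0
    left = kappa * math.exp(-kappa * eps)     # slope just to the left of z = 0
    return right - left


# Assumed illustrative values in units hbar = m_e = 1.
m_e, strength = 1.0, 0.8
kappa, energy = delta_bound_state(m_e, strength)
print("kappa =", kappa, " E =", energy)
print("jump  =", derivative_jump(kappa), " Eq. 6.78 rhs =", -2 * m_e * strength)
```

The printed jump of the derivative agrees with the right-hand side of Eq. 6.78 for $\varphi(0) = 1$, and the energy equals $-\hbar^2\kappa^2/2m_e$, as the definition of the decay constant requires.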

Solutions with positive energies can be constructed in the same way as it was done for the rectangular potential well or barrier

$$\varphi(z) = \begin{cases} A_1 e^{ikz} + B_1 e^{-ikz}, & z < 0 \\ A_2 e^{ikz}, & z > 0 \end{cases} \tag{6.80}$$

where $k = \sqrt{2m_eE}/\hbar$, and the continuity of the wave function at $z = 0$ yields

$$A_1 + B_1 = A_2.$$

The derivative stitching condition, Eq. 6.78, generates the following equation:

$$ikA_2 - ik\left(A_1 - B_1\right) = -\frac{2m_e}{\hbar^2}\varsigma A_2.$$

Solving these two equations for B1 and A2, one can obtain

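$$\frac{A_2}{A_1} = \frac{1}{1 - i\beta}, \qquad \frac{B_1}{A_1} = \frac{i\beta}{1 - i\beta},$$

where I introduced a convenient dimensionless parameter $\beta$ defined as

$$\beta = \frac{m_e\varsigma}{k\hbar^2}.$$

The amplitude transmission and reflection coefficients $t = A_2/A_1$ and $r = B_1/A_1$ are complex numbers, which can be presented in exponential form using Euler's formula as

$$r = \sqrt{R}\,e^{i\phi_r}, \qquad t = \sqrt{T}\,e^{i\phi_t},$$

where the reflection and transmission probabilities $R = |r|^2$ and $T = |t|^2$ and the corresponding phases $\phi_r$ and $\phi_t$ are given by

$$R = \frac{\beta^2}{1 + \beta^2}, \qquad T = \frac{1}{1 + \beta^2} \tag{6.81}$$

$$\phi_r = \pi - \arctan\frac{1}{\beta}, \qquad \phi_t = \arctan\beta. \tag{6.82}$$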


While the phases of the amplitude reflection and transmission coefficients do not affect the probabilities, they still play an important role and can be observed. The reflected wave function interferes with the function describing incident particles and determines the spatial distribution of relative probabilities of position measurements. These phases also define the temporal behavior of the particles in situations involving nonstationary states, but a discussion of such situations is outside the scope of this book.
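The scattering amplitudes for the delta potential can be cross-checked by solving the two matching conditions directly. The following is a minimal sketch with assumed units $\hbar = m_e = 1$; the names `strength` (the strength of the potential) and `beta` (the dimensionless parameter $m_e \cdot \text{strength}/(k\hbar^2)$ introduced above) are labels chosen for this illustration.

```python
import cmath
import math


def delta_scattering(k, strength, m_e=1.0, hbar=1.0):
    """Solve the two matching conditions at z = 0 for r = B1/A1 and t = A2/A1."""
    g = 2.0 * m_e * strength / hbar**2
    # With A1 = 1:  B1 - A2 = -1  and  i*k*B1 + (i*k + g)*A2 = i*k.
    det = 2j * k + g            # determinant of this 2x2 linear system
    r = -g / det
    t = 2j * k / det
    return r, t


# Assumed illustrative values in units hbar = m_e = 1.
k, strength = 1.5, 0.9
r, t = delta_scattering(k, strength)
beta = strength / k
print("R =", abs(r) ** 2, " Eq. 6.81:", beta**2 / (1 + beta**2))
print("T =", abs(t) ** 2, " Eq. 6.81:", 1 / (1 + beta**2))
print("phase of t =", cmath.phase(t), " arctan(beta) =", math.atan(beta))
```

The sketch reproduces the probabilities of Eq. 6.81 and the transmission phase, and the continuity condition $1 + r = t$ holds for any choice of $k$ and strength.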

6.4 Problems

Problems for Sect. 6.2.1

Problem 77 Derive Eq. 6.42.

Problem 78 Find the reflection and transmission coefficients for the potential barrier shown in Fig. 6.9. Show that $R + T = 1$.

Problem 79 In quantum tunneling, the penetration probability is sensitive to slight changes in the height and/or width of the barrier. Consider an electron with energy $E = 15$ eV incident on a rectangular barrier of height $V = 7$ eV and width $d = 1.8$ nm. By what factor does the penetration probability change if the width is decreased to $d = 1.7$ nm?

Problem 80 Consider a step potential

$$V(z) = \begin{cases} 0, & z < 0 \\ V_0, & z > 0. \end{cases}$$

Calculate the reflection and transmission probabilities for two cases 0 < E < V0 and E > V0.

Fig. 6.9 Potential barrier with an asymmetric potential


Problem 81 Find an equation for the energy levels of a particle of mass $m_e$ moving in a potential of the form

$$V(z) = \begin{cases} \infty, & z < -a \\ 0, & -a < z < -b \\ V_0, & -b < z < b \\ 0, & b < z < a \\ \infty, & z > a. \end{cases}$$

Consider even and odd wave functions separately. Using any graphing software, find the approximate values of the two lowest energies if $m_e = 1.78 \times 10^{-27}$ kg, $a = 0.12$ nm, $b = 0.42$ nm, $V_0 = 1.5$ eV. Sketch the respective wave functions for each of the found eigenvalues.

Problem 82 Consider a particle moving in a potential comprised of two attractive delta-functional potentials separated by a distance $d$:

$$V(x) = -\varsigma\delta\left(x + d/2\right) - \varsigma\delta\left(x - d/2\right).$$

1. Derive an equation for discrete energy levels in this potential, and solve it if possible. How many discrete energy levels does this potential have? Analyze the behavior of these energy levels when the distance $d$ between the wells increases.

2. Find the wave functions corresponding to the continuous segment of the spectrum, and determine the respective transmission and reflection probabilities.

Chapter 7 Harmonic Oscillator Models

It is as difficult to overestimate the role of harmonic oscillator models in physics in general and in quantum mechanics in particular as it is to overestimate the influence of the Beatles and Led Zeppelin on modern popular music. Harmonic oscillators are ubiquitous and appear whenever one is dealing with a system that has a state of equilibrium in the vicinity of which it can oscillate, i.e., in a vast majority of physical systems: atoms, molecules, solids, the electromagnetic field, etc. It also does not hurt their popularity that the harmonic oscillator is one of the very few models that can be solved exactly.

Consider a particle moving in a potential $V(x, y, z)$, which has a minimum at some point $x = y = z = 0$. Mathematically speaking, this means that at this point $\partial V/\partial x = \partial V/\partial y = \partial V/\partial z = 0$, while the matrix of the second derivatives $L_{ij} \equiv \left.\partial^2 V/\partial r_i\partial r_j\right|_{x=y=z=0}$, where $r_1 \equiv x$, $r_2 \equiv y$, and $r_3 \equiv z$, is positive definite.

If you still remember the connection between the potential energy and the force in classical mechanics, you should recognize that in this situation, point x D y D z D 0 corresponds to the particle being in the state of stable equilibrium. Stable in this context means that a particle removed from the equilibrium by a small distance will be forced to move back toward it rather than away from it. Expanding potential energy in a power series in the vicinity of the equilibrium and keeping only the first nonvanishing terms, you will get

$$V(x, y, z) \approx \frac{1}{2}\sum_{i,j} L_{ij}r_ir_j.$$

The respective classical Hamiltonian equations 3.2 and 3.3 yield for this potential:

$$\frac{dp_i}{dt} = -\sum_j L_{ij}r_j \tag{7.1}$$

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_7


$$\frac{dr_i}{dt} = \frac{p_i}{m_e} \tag{7.2}$$

($i = x, y, z$). They can be converted into Newton’s equations by differentiating Eq. 7.2 with respect to time and eliminating the resulting time derivative of the momentum using Eq. 7.1:

$$\frac{d^2r_i}{dt^2} = -\frac{1}{m_e}\sum_j L_{ij}r_j. \tag{7.3}$$

The presence of nondiagonal elements in the matrix $L_{ij}$ indicates that the particle’s motion in the direction of any of the chosen axes $X$, $Y$, or $Z$ is not independent of its motion in the other directions. In layman’s terms, it means that it is impossible to arrange for this particle to move purely in the direction of any of the axes. Nevertheless, solutions of these equations can still be presented in the standard time-harmonic form $r_i = a_i\exp(i\omega t)$ with amplitudes $a_i$ and frequency $\omega$ obeying the equations

$$\frac{1}{m_e}\sum_j L_{ij}a_j = \omega^2a_i, \tag{7.4}$$

which is an eigenvalue equation for the matrix $L_{ij}/m_e$. It is obvious that this is a symmetric ($L_{ij} = L_{ji}$), real-valued, and, therefore, Hermitian matrix. Thus, based on the eigenvalue theorems discussed in Sect. 3.3.1, this matrix is guaranteed to have real eigenvalues and corresponding orthogonal eigenvectors. The equation for the eigenvalues is found by requiring that Eq. 7.4 has nontrivial solutions:

$$\det\left(m_e\omega^2\delta_{ij} - L_{ij}\right) = 0$$

and in general has three solutions $\omega_n^2$, where $n = 1, 2, 3$. Substituting each of these frequencies back into Eq. 7.4, you can find the amplitudes $a_x^{(n)}, a_y^{(n)}, a_z^{(n)}$, which form the corresponding eigenvectors. These eigenvectors are regular three-dimensional vectors defining three mutually orthogonal directions in space. Oscillations in each of these directions, called normal modes, are characterized by their own frequencies $\omega_n$ and can occur independently of each other. Indeed, these three vectors can be used as a new basis, which in this particular case amounts to introducing new coordinate axes along the directions of the normal modes. The matrix $L_{ij}/m_e$ transformed to this basis becomes diagonal, and introducing the notation $\xi_1$, $\xi_2$, and $\xi_3$ for the coordinates along these new directions, Eq. 7.3 takes the form of three independent differential equations:

$$\frac{d^2\xi_n}{dt^2} = -\omega_n^2\xi_n. \tag{7.5}$$
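The normal-mode construction can be made concrete with a small example. The sketch below is an illustration under stated assumptions, not the author's procedure: it restricts the problem to two dimensions so that the eigenvalue equation 7.4 can be solved in closed form, and the function name and the sample matrix are hypothetical.

```python
import math


def normal_modes_2d(Lxx, Lyy, Lxy, m_e):
    """Frequencies and unit eigenvectors of L/m_e for a symmetric 2x2 matrix L.

    Solves Eq. 7.4 in closed form; L is assumed to be positive definite.
    """
    if Lxy == 0.0:
        # Already diagonal: the coordinate axes are the normal-mode directions.
        pairs = sorted([(Lxx, (1.0, 0.0)), (Lyy, (0.0, 1.0))])
        return [(math.sqrt(lam / m_e), vec) for lam, vec in pairs]
    mean = 0.5 * (Lxx + Lyy)
    half_gap = math.hypot(0.5 * (Lxx - Lyy), Lxy)
    modes = []
    for lam in (mean - half_gap, mean + half_gap):
        vx, vy = Lxy, lam - Lxx          # satisfies (L - lam) v = 0
        norm = math.hypot(vx, vy)
        modes.append((math.sqrt(lam / m_e), (vx / norm, vy / norm)))
    return modes


# Assumed illustrative matrix of second derivatives (arbitrary units).
for omega, (ax, ay) in normal_modes_2d(2.0, 2.0, 1.0, m_e=1.0):
    print(f"omega = {omega:.6f}  direction = ({ax:+.6f}, {ay:+.6f})")
```

For this sample matrix the two directions come out along the diagonals of the original axes and are mutually orthogonal, which is the property that allows the motion to separate into independent oscillations.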


One can also show (those interested in the details are welcome to read any of the many textbooks on classical mechanics, or molecular oscillations, or a combination thereof) that the Hamiltonian written in terms of these new coordinates and the corresponding conjugate momenta $\pi_n$ takes the form

$$H = \sum_n\left(\frac{\pi_n^2}{2m_n} + \frac{1}{2}m_n\omega_n^2\xi_n^2\right),$$

which is the sum of three independent one-dimensional Hamiltonians. The transition to this form is not so trivial, and the mass parameter $m_n$ does not have to coincide with the actual mass of the particle. Nevertheless, as long as $\xi_n$ and $\pi_n$ form a canonically conjugate pair characterized by the standard coordinate-momentum Poisson bracket, Eq. 3.5, we can treat them as such for all practical purposes, including quantization.

Thus, using the concept of normal modes, one can always reduce a problem involving harmonic oscillations to a simple combination of one-dimensional problems. This is actually true even in the case involving oscillations of several particles, such as multi-atom molecules. Therefore, using the one-dimensional model to describe harmonic oscillations is even more justified than using the one-dimensional models described in the previous chapter. And so, the one-dimensional model of the quantum harmonic oscillator is what I am going to consider next.

7.1 One-Dimensional Harmonic Oscillator

7.1.1 Stationary States (Eigenvalues and Eigenvectors)

The classical mechanics of the one-dimensional harmonic oscillator is described by the Hamiltonian

$$H = \frac{p^2}{2m_e} + \frac{1}{2}m_e\omega^2x^2, \tag{7.6}$$

and respective Hamiltonian equations for momentum p and coordinate x are obtained by specializing Eqs. 7.1 and 7.2 to the one-dimensional situation:

$$\frac{dp}{dt} = -m_e\omega^2x \tag{7.7}$$

$$\frac{dx}{dt} = \frac{p}{m_e} \tag{7.8}$$

where I replaced the corresponding diagonal element of the matrix $L_{ij}$ as $L_{xx} \equiv m_e\omega^2$. These equations are, of course, easy to solve, and the solution is well known:


$$x = x_0\cos\omega t + \frac{p_0}{m_e\omega}\sin\omega t$$

$$p = -m_e\omega x_0\sin\omega t + p_0\cos\omega t, \tag{7.9}$$

where $x_0$ and $p_0$ are the initial values of the coordinate and momentum of the particle. Equation 7.9 describes a familiar harmonic time dependence, which can be presented in terms of an amplitude $A$ and an initial phase $\delta$:

$$x(t) = A\sin\left(\omega t + \delta\right).$$

Both $A$ and $\delta$ are determined by the initial conditions: the amplitude by the total energy $E$ of the oscillator, which, as you know, is a conserved quantity, and the phase by the ratio of the initial coordinate and momentum. Recalling that at the maximum displacement $E$ takes entirely the form of potential energy, you can write

$$\frac{1}{2}m_e\omega^2A^2 = E = \frac{p_0^2}{2m_e} + \frac{1}{2}m_e\omega^2x_0^2 \;\Rightarrow\; A = \sqrt{\frac{2E}{m_e\omega^2}} = \sqrt{x_0^2 + \frac{p_0^2}{m_e^2\omega^2}}. \tag{7.10}$$

The phase of the oscillator can be found by expanding $\sin\left(\omega t + \delta\right) = \sin\omega t\cos\delta + \cos\omega t\sin\delta$ and equating the resulting terms with their counterparts in Eq. 7.9. This yields

$$A\cos\delta = \frac{p_0}{m_e\omega}, \qquad A\sin\delta = x_0,$$

and subsequently

$$\tan\delta = \frac{x_0m_e\omega}{p_0}.$$

Obviously, the motion of a harmonic oscillator is bounded, with the maximum deviation from the equilibrium position given by its amplitude $A$, Eq. 7.10. The coordinate $x$ becomes equal to $\pm A$ at the two turning points, where the velocity of the oscillator and, respectively, its kinetic energy turn to zero. The relation between the total, potential, and kinetic energies of the harmonic oscillator can be illustrated by the diagram shown in Fig. 7.1, where vertical lines show the turning points of the classical motion.
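The relations between the initial conditions, the amplitude, and the phase are easy to test numerically. The following is a minimal sketch with hypothetical function names and arbitrary illustrative values; it uses `atan2` instead of the tangent alone, which is a choice made here to pick the correct quadrant of the phase.

```python
import math


def amplitude_and_phase(x0, p0, m_e, omega):
    """Amplitude A (Eq. 7.10) and initial phase delta of x(t) = A sin(omega t + delta)."""
    A = math.sqrt(x0**2 + (p0 / (m_e * omega)) ** 2)
    # atan2 uses A sin(delta) = x0 and A cos(delta) = p0 / (m_e omega) together,
    # so it resolves the quadrant that tan(delta) alone leaves ambiguous.
    delta = math.atan2(x0, p0 / (m_e * omega))
    return A, delta


def x_from_initial_conditions(t, x0, p0, m_e, omega):
    """Coordinate from Eq. 7.9."""
    return x0 * math.cos(omega * t) + p0 / (m_e * omega) * math.sin(omega * t)


# Assumed illustrative initial conditions (arbitrary units).
x0, p0, m_e, omega = 0.3, -1.2, 2.0, 1.7
A, delta = amplitude_and_phase(x0, p0, m_e, omega)
for t in (0.0, 0.4, 1.1, 2.5):
    direct = x_from_initial_conditions(t, x0, p0, m_e, omega)
    harmonic = A * math.sin(omega * t + delta)
    print(f"t = {t:.1f}  Eq. 7.9: {direct:+.6f}  A sin(wt + delta): {harmonic:+.6f}")
```

The two columns agree at every time, including for a negative initial momentum, where using the tangent alone would give a phase that is off by $\pi$.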

Even though you have all known the solution to the harmonic oscillator problem almost since elementary school, you might find it useful to play with its Hamiltonian a bit more. Let me, for instance, factorize the Hamiltonian, taking advantage of its $u^2 + v^2$ form, which can be presented as $(u + iv)(u - iv)$:


Fig. 7.1 Energy diagram for a classical oscillator. The horizontal line corresponds to its total energy E, and vertical dashed lines indicate the turning points and coordinates corresponding to two maximum displacements

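$$H = \left(\frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x\right)\left(\frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x\right). \tag{7.11}$$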

Now, on a whim, I am going to compute the Poisson bracket involving these factors. Designating the first of them as u,

$$u = \frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x,$$

and the second one as $u^*$,

$$u^* = \frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x,$$

I find, using Eq. 3.4 for Poisson bracket,

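$$\left\{u, u^*\right\} = \frac{\partial u}{\partial x}\frac{\partial u^*}{\partial p} - \frac{\partial u}{\partial p}\frac{\partial u^*}{\partial x} = i\omega.$$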

Using this result as a hint, I now introduce new variables:

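$$b = -\frac{i}{\sqrt{\omega}}u = \sqrt{\frac{m_e\omega}{2}}\,x - i\frac{p}{\sqrt{2m_e\omega}} \tag{7.12}$$

$$b^* = \frac{1}{\sqrt{\omega}}u^* = -i\sqrt{\frac{m_e\omega}{2}}\,x + \frac{p}{\sqrt{2m_e\omega}} \tag{7.13}$$

whose Poisson bracket, by design, of course, is

$$\left\{b, b^*\right\} = 1.$$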


This means that $b$ and $b^*$ constitute a canonically conjugate pair (if you have already forgotten what I am talking about, check Sect. 3.1), with $b$ playing the role of the coordinate and $b^*$ pretending to be the momentum. Computing $bb^*$ (do it!), you will see that the Hamiltonian can be presented as

$$H = i\omega bb^*.$$

The corresponding Hamiltonian equations are

$$\frac{db}{dt} = \frac{\partial H}{\partial b^*} = i\omega b \tag{7.14}$$

$$\frac{db^*}{dt} = -\frac{\partial H}{\partial b} = -i\omega b^*. \tag{7.15}$$

The advantage of these equations as compared to initial Eqs. 7.7 and 7.8 is that they are independent first-order differential equations, which can be easily solved:

$$b = b_0e^{i\omega t}, \qquad b^* = b_0^*e^{-i\omega t}. \tag{7.16}$$

The initial coordinate and momentum can be expressed in terms of $b$ and $b^*$ by inverting Eqs. 7.12 and 7.13, but I will leave it for you as an exercise.
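The statements about the new variables can be checked against the familiar solution of Eq. 7.9. The sketch below is an illustration with hypothetical function names and arbitrary numerical values: it builds $b$ and $b^*$ from $x$ and $p$ according to Eqs. 7.12 and 7.13 and compares their time dependence with Eq. 7.16.

```python
import cmath
import math


def b_pair(x, p, m_e, omega):
    """Variables b and b* of Eqs. 7.12 and 7.13."""
    b = math.sqrt(m_e * omega / 2) * x - 1j * p / math.sqrt(2 * m_e * omega)
    b_star = -1j * math.sqrt(m_e * omega / 2) * x + p / math.sqrt(2 * m_e * omega)
    return b, b_star


def classical_state(t, x0, p0, m_e, omega):
    """Coordinate and momentum from Eq. 7.9."""
    x = x0 * math.cos(omega * t) + p0 / (m_e * omega) * math.sin(omega * t)
    p = -m_e * omega * x0 * math.sin(omega * t) + p0 * math.cos(omega * t)
    return x, p


# Assumed illustrative values (arbitrary units).
x0, p0, m_e, omega, t = 0.3, -1.2, 2.0, 1.7, 0.9
b0, b0_star = b_pair(x0, p0, m_e, omega)
b_t, b_star_t = b_pair(*classical_state(t, x0, p0, m_e, omega), m_e, omega)
energy = p0**2 / (2 * m_e) + 0.5 * m_e * omega**2 * x0**2

print("|b(t)  - b0  exp(+i w t)| =", abs(b_t - b0 * cmath.exp(1j * omega * t)))
print("|b*(t) - b0* exp(-i w t)| =", abs(b_star_t - b0_star * cmath.exp(-1j * omega * t)))
print("|i w b b* - H|            =", abs(1j * omega * b_t * b_star_t - energy))
```

All three differences vanish to rounding accuracy, which confirms both the form of the Hamiltonian in the new variables and the solutions of the first-order equations.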

The transition between the pairs $x, p$ and $b, b^*$ is an example of a so-called canonical transformation of variables, and the only reason I decided to bother you with it is that it paves the way to a better understanding of its quantum analog, which is of crucial importance. According to the quantization rules discussed in Sect. 3.3.2, the transition from the classical to the quantum description consists in promoting classical variables to quantum operators, and the coordinate-momentum dyad plays a crucial role in the process precisely because it is a canonical pair. The operators replacing classical variables are, to a large extent, defined by their commutation relations, and in the case of canonical pairs, the commutator is directly linked to the respective Poisson bracket, as I have already mentioned. In the case of the coordinate-momentum pair, the corresponding commutator is obtained from the Poisson bracket by multiplying it by $i\hbar$. Just as any pair of variables characterized by the canonical Poisson bracket plays a role similar to that of the coordinate and momentum in classical mechanics, any quantum mechanical pair of operators with the canonical commutator $i\hbar$ will have properties similar to those of the coordinate and momentum. For instance, if I know that two Hermitian operators $\hat{\xi}$ and $\hat{\pi}$ have the commutator $[\hat{\xi}, \hat{\pi}] = i\hbar$, I can claim without any doubt that the operator $\hat{\pi}$ in the representation based on the eigenvectors of $\hat{\xi}$ is $\hat{\pi} = -i\hbar\,\partial/\partial\xi$, similar to the momentum operator in the coordinate representation. I have to emphasize, however, the requirement that the operators must be Hermitian. Therefore, if I were to promote $b$ and $b^*$ to operators, it would not work, because they would not be Hermitian. Nevertheless, operators similar to $b$ and $b^*$ (while not exactly like them) do play an important role in quantum theory (and not just for harmonic oscillators).


Now I am ready to get down to our main business and start developing quantum theory of harmonic oscillators. The goal is to develop the theory as far as possible without resorting to any particular representation for momentum and coordinate operators. Such an approach will produce the most general results, independent of a representation, offer important insights into the quantum properties of oscillators, and create a formal framework for extending this theory beyond pure mechanical harmonic oscillators.

I start by “factorizing” the quantum Hamiltonian in a way similar to the factorization of the classical Hamiltonian in Eq. 7.11. However, to make sure that this factorization works for operators, I would like to review the origin of the identity

$$u^2 + v^2 = (u + iv)(u - iv).$$

Removing the parentheses on its right-hand side, I have

$$(u + iv)(u - iv) = u^2 + v^2 + ivu - iuv.$$

If u and v are regular variables, the last two terms in this expression cancel, but if they are non-commuting operators, it is quite obvious that the original factorization rule is no longer true and must be corrected:

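$$\hat{u}^2 + \hat{v}^2 = \left(\hat{u} + i\hat{v}\right)\left(\hat{u} - i\hat{v}\right) + i\left[\hat{u}, \hat{v}\right]. \tag{7.17}$$

The order of the terms in the parentheses on the right-hand side of this expression can be changed, which results in an alternative form of the identity:

$$\hat{u}^2 + \hat{v}^2 = \left(\hat{u} - i\hat{v}\right)\left(\hat{u} + i\hat{v}\right) - i\left[\hat{u}, \hat{v}\right]. \tag{7.18}$$

Identifying $\hat{u}$ and $\hat{v}$ as

$$\hat{u} = \sqrt{\frac{m_e}{2}}\,\omega\hat{x} \tag{7.19}$$

$$\hat{v} = \frac{\hat{p}}{\sqrt{2m_e}}, \tag{7.20}$$

I find

$$\left[\hat{u}, \hat{v}\right] = \frac{1}{2}\omega\left[\hat{x}, \hat{p}\right] = \frac{1}{2}i\hbar\omega. \tag{7.21}$$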
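The operator factorization identities 7.17 and 7.18 hold for any pair of non-commuting operators, which makes them easy to test on small matrices. The sketch below is an illustration only: it uses two Pauli matrices as assumed stand-ins for $\hat{u}$ and $\hat{v}$ (they are not the oscillator operators), and the helper functions are hypothetical names written for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def combine(a, b, coeff=1):
    """Return a + coeff * b element by element."""
    return [[x + coeff * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def max_abs_difference(a, b):
    """Largest absolute element of a - b."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Assumed stand-ins for two non-commuting Hermitian operators: Pauli matrices.
u = [[0, 1], [1, 0]]
v = [[0, -1j], [1j, 0]]

lhs = combine(matmul(u, u), matmul(v, v))                  # u^2 + v^2
commutator = combine(matmul(u, v), matmul(v, u), -1)       # [u, v]
plus, minus = combine(u, v, 1j), combine(u, v, -1j)        # u + iv and u - iv

naive = matmul(plus, minus)                                # no commutator correction
eq_7_17 = combine(matmul(plus, minus), commutator, 1j)     # (u+iv)(u-iv) + i[u,v]
eq_7_18 = combine(matmul(minus, plus), commutator, -1j)    # (u-iv)(u+iv) - i[u,v]

print("naive factorization error:", max_abs_difference(lhs, naive))
print("Eq. 7.17 error:", max_abs_difference(lhs, eq_7_17))
print("Eq. 7.18 error:", max_abs_difference(lhs, eq_7_18))
```

The naive product misses the sum of squares by a finite amount, while both corrected forms reproduce it exactly, which is the whole point of the commutator term.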

Operators $\hat{u}$ and $\hat{v}$ have the dimension of $\sqrt{\text{energy}}$, while their commutator, proportional to $\hbar\omega$, obviously has the dimension of energy. If I am not mistaken, I have already remarked that it is often quite beneficial to work with dimensionless quantities. Thus, taking a clue from Eqs. 7.17 and 7.18 and the experience gained working with the classical Hamiltonian, I will try to generate dimensionless operators such as

$$\hat{a} = \frac{1}{\sqrt{\hbar\omega}}\left(\hat{u} + i\hat{v}\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat{x} + i\frac{\hat{p}}{\sqrt{2m_e\hbar\omega}} \tag{7.22}$$

$$\hat{a}^\dagger = \frac{1}{\sqrt{\hbar\omega}}\left(\hat{u} - i\hat{v}\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat{x} - i\frac{\hat{p}}{\sqrt{2m_e\hbar\omega}}. \tag{7.23}$$

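The commutator of these operators is

$$\left[\hat{a}, \hat{a}^\dagger\right] = -i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}\left[\hat{x}, \hat{p}\right] + i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}\left[\hat{p}, \hat{x}\right] = -\frac{i}{2\hbar}\left[\hat{x}, \hat{p}\right] + \frac{i}{2\hbar}\left[\hat{p}, \hat{x}\right] = 1.$$

Due to the special importance of this result, I will reproduce it as a separate numbered formula:

$$\left[\hat{a}, \hat{a}^\dagger\right] = 1. \tag{7.24}$$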

The operators $\hat{a}$ and $\hat{a}^\dagger$ are clearly not Hermitian: performing Hermitian conjugation of Eqs. 7.22 and 7.23, you can immediately see that they are actually Hermitian conjugates of each other, hence the notation $\hat{a}^\dagger$. It will also be useful to express the coordinate and momentum operators in terms of $\hat{a}$ and $\hat{a}^\dagger$. Adding and subtracting Eqs. 7.22 and 7.23, I can invert these equations to get

$$\hat{x} = \sqrt{\frac{\hbar}{2\omega m_e}}\left(\hat{a} + \hat{a}^\dagger\right), \tag{7.25}$$

$$\hat{p} = i\sqrt{\frac{\hbar\omega m_e}{2}}\left(\hat{a}^\dagger - \hat{a}\right). \tag{7.26}$$

Using the operator factorization identities 7.17 or 7.18 with $\hat{u}$ and $\hat{v}$ defined in Eqs. 7.19 and 7.20, I can derive two alternative forms of the Hamiltonian:

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{a}^\dagger\hat{a}\right) = \hbar\omega\left(-\frac{1}{2} + \hat{a}\hat{a}^\dagger\right).$$

These two expressions differ by the order of the operators in them and by the sign in front of $1/2$. Formally they are absolutely equivalent, and one can be reduced to the other using the commutation relation 7.24. However, from a practical point of view (and you will have to trust me on this for now), the first of these expressions is much more convenient to use than the other. Thus, in what follows, I will rely on the representation of the Hamiltonian in the form

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{a}^\dagger\hat{a}\right). \tag{7.27}$$

Our first task is to find the eigenvalues and eigenvectors of this Hamiltonian, i.e., the stationary states of the harmonic oscillator. Since the classical motion in the harmonic potential is bound for all values of energy, it should be expected that the entire spectrum of the Hamiltonian is discrete, so that the yet unknown energy eigenvalues can be labeled by a discrete index as $E_n$ and the respective eigenvectors as $|E_n\rangle$:

$$\hat{H}|E_n\rangle = E_n|E_n\rangle. \tag{7.28}$$

Since I am not allowed to use any particular representation for the coordinate and momentum operators, all that I have to go on are the commutation relations. This invites me to use the same purely algebraic technique that I successfully used previously when searching for the eigenvalues of the angular momentum operators in Sect. 3.3.4. However, in the role of the angular momentum ladder operators $\hat{L}_\pm$, I am going to cast operators $\hat{a}$ and $\hat{a}^\dagger$, which appear to have some similarities with $\hat{L}_\pm$: they are also non-Hermitian and are Hermitian conjugates of each other. You might remember that operators $\hat{L}_\pm$ applied to an eigenvector of operator $\hat{L}_z$ generate other eigenvectors with a decreased or increased eigenvalue. Will you be surprised if it turns out that operators $\hat{a}$ and $\hat{a}^\dagger$ do the same to the eigenvectors of the harmonic oscillator? Probably not.

The first step is to note that the eigenvectors of the Hamiltonian coincide with those of the operator $\hat{N} = \hat{a}^\dagger\hat{a}$, which is called the number operator and is obviously Hermitian. Indeed, once you rewrite the Hamiltonian as

$$\hat{H} = \hbar\omega\left(\frac{1}{2} + \hat{N}\right),$$

this statement becomes pretty obvious. Moreover, you can immediately see that if $\lambda_n$ is the eigenvalue of $\hat{N}$, $\hat{N}|E_n\rangle = \lambda_n|E_n\rangle$, then

$$E_n = \hbar\omega\left(\frac{1}{2} + \lambda_n\right). \tag{7.29}$$

Therefore, I can focus my attention on finding the eigenvalues and eigenvectors of the number operator $\hat{N}$. To this end, I first compute the commutator $[\hat{N}, \hat{a}]$ (if you want to know what prompted me to do so, the only excuse I can offer is that there isn’t much more for me to do, so why not do that?):

$$\left[\hat{N}, \hat{a}\right] = \hat{a}^\dagger\hat{a}^2 - \hat{a}\hat{a}^\dagger\hat{a} = \left(\hat{a}^\dagger\hat{a} - \hat{a}\hat{a}^\dagger\right)\hat{a} = -\hat{a}, \tag{7.30}$$

where I took advantage of Eq. 7.24. Carrying out Hermitian conjugation of this result, and remembering to change the order of the operators in their product after Hermitian conjugation, I immediately obtain

$$\left[\hat{N}, \hat{a}^\dagger\right] = \hat{a}^\dagger. \tag{7.31}$$

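In the next step, I consider $\hat{N}\hat{a}|E_n\rangle$ and use the commutation relation 7.30 to get

$$\hat{N}\hat{a}|E_n\rangle = -\hat{a}|E_n\rangle + \hat{a}\hat{N}|E_n\rangle = \lambda_n\hat{a}|E_n\rangle - \hat{a}|E_n\rangle = \left(\lambda_n - 1\right)\hat{a}|E_n\rangle.$$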

This result shows that $\hat{a}|E_n\rangle$ is an eigenvector of $\hat{N}$ with eigenvalue $\lambda_n - 1$, i.e., the operator $\hat{a}$ generates eigenvectors of $\hat{N}$ with eigenvalues decreasing by one with each application of the operator. Not surprisingly, this operator is called the lowering operator. The questions that naturally pop up at this point are how far down in energy one can go and how one knows when the bottom is reached. The answer to the first question is obvious: the energy eigenvalues of the harmonic oscillator can never be negative, and thus $\lambda_n > -1/2$. The second question is answered by recycling the arguments that I have already used when discussing the angular momentum: the only way to reconcile the ability of $\hat{a}$ to keep decreasing $\lambda_n$ every time it is applied with the requirement that there must exist a smallest $\lambda_n$ is to impose on the eigenvector corresponding to this minimum value the condition

$$\hat{a}|E_{\min}\rangle = 0. \tag{7.32}$$

Another useful relation is obtained by performing Hermitian conjugation of this equation:

$$\langle E_{\min}|\hat{a}^\dagger = 0. \tag{7.33}$$
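The commutation relations of the number operator with the ladder operators can be illustrated with finite matrices. The sketch below rests on an assumption that goes beyond what has been derived up to this point: it takes the standard matrix elements $\langle n-1|\hat{a}|n\rangle = \sqrt{n}$ in the basis of number-operator eigenvectors and truncates the matrices at a finite size. The helper names are hypothetical.

```python
import math


def lowering_matrix(dim):
    """Truncated matrix of a with <n-1| a |n> = sqrt(n) (assumed, not derived above)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


dim = 6
a = lowering_matrix(dim)
a_dag = transpose(a)            # real matrix, so the Hermitian conjugate is the transpose
N = matmul(a_dag, a)            # number operator N = a^dagger a

print("diagonal of N:", [round(N[n][n], 12) for n in range(dim)])
err = max(abs(c + x) for rc, ra in zip(commutator(N, a), a) for c, x in zip(rc, ra))
print("max |[N, a] + a| =", err)
```

In this representation the first column of the lowering matrix is zero, which is the matrix form of the condition that the lowering operator annihilates the lowest state, and the relation between the number operator and the lowering operator holds even for the truncated matrices.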

Now you are going to appreciate the wisdom of writing the Hamiltonian in the form of Eq. 7.27 and of introducing operator $\hat{N}$. Indeed, Eq. 7.32 used in $\hat{N}|E_{\min}\rangle$ gives $\hat{N}|E_{\min}\rangle = \hat{a}^\dagger\hat{a}|E_{\min}\rangle = 0$, which means that the minimum value $\lambda_{\min} = 0$, and $E_{\min} = \hbar\omega/2$. So, behold the power of the lowering operator: we found the bottom, the lowest possible energy of a harmonic oscillator, its ground state!

Just like in other examples, the lowest energy is not zero, which is, of course, the consequence of the uncertainty principle: zero energy would require that both kinetic and potential energies are equal to zero, which would mean that both coordinate and momentum operators would have certain values of zero, which is impossible. The ground state energy „!=2 is one of the clearest examples of the energy associated with so-called quantum fluctuations.

The contribution of these fluctuations to the energy can be quantified by computing the expectation values of $\hat{p}^2$ and $\hat{x}^2$, which determine the quantum uncertainties of the respective observables. Using Eqs. 7.25 and 7.26, which express operators $\hat{p}$ and $\hat{x}$ in terms of operators $\hat{a}$ and $\hat{a}^\dagger$, in conjunction with Eqs. 7.32 and 7.33, you can immediately see that

$$\langle E_{\min}|\hat{x}|E_{\min}\rangle = \langle E_{\min}|\hat{p}|E_{\min}\rangle = 0,$$

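so that the uncertainties $\Delta p$ and $\Delta x$ are $\Delta p = \sqrt{\langle\hat{p}^2\rangle}$ and $\Delta x = \sqrt{\langle\hat{x}^2\rangle}$. Squaring Eqs. 7.25 and 7.26, and computing these expectation values with the state $|E_{\min}\rangle$, you will get

$$\langle E_{\min}|\hat{x}^2|E_{\min}\rangle = \frac{\hbar}{2m_e\omega}\left(\langle E_{\min}|\hat{a}^2|E_{\min}\rangle + \langle E_{\min}|\left(\hat{a}^\dagger\right)^2|E_{\min}\rangle + \langle E_{\min}|\hat{a}^\dagger\hat{a}|E_{\min}\rangle + \langle E_{\min}|\hat{a}\hat{a}^\dagger|E_{\min}\rangle\right).$$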

The first three terms in this expression vanish, thanks to Eqs. 7.32 and 7.33. However, the last term requires some more effort because the order of operators $\hat{a}$ and $\hat{a}^\dagger$ in it is “wrong” in the sense that it is not conducive to the immediate application of Eqs. 7.32 and 7.33. The situation, however, can be quite easily rectified by using the commutation relation 7.24 to change this order and rewrite this term as

$$\langle E_{\min}|\hat{a}\hat{a}^\dagger|E_{\min}\rangle = \langle E_{\min}|\left(1 + \hat{a}^\dagger\hat{a}\right)|E_{\min}\rangle = 1,$$

where I, as usual, assumed that whatever the state vector $|E_{\min}\rangle$ is, it is normalized. Thus, finally, I find

$$\langle E_{\min}|\hat{x}^2|E_{\min}\rangle = \frac{\hbar}{2m_e\omega}. \tag{7.34}$$

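Similarly,

$$\langle E_{\min}|\hat{p}^2|E_{\min}\rangle = -\frac{m_e\omega\hbar}{2}\left[\langle E_{\min}|\hat{a}^2|E_{\min}\rangle + \langle E_{\min}|(\hat{a}^\dagger)^2|E_{\min}\rangle - \langle E_{\min}|\hat{a}^\dagger\hat{a}|E_{\min}\rangle - \langle E_{\min}|\hat{a}\hat{a}^\dagger|E_{\min}\rangle\right] = \frac{m_e\omega\hbar}{2}\langle E_{\min}|\hat{a}\hat{a}^\dagger|E_{\min}\rangle = \frac{m_e\omega\hbar}{2}. \tag{7.35}$$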

Using Eqs. 7.34 and 7.35 in the expressions for the kinetic and potential energies, $\hat{p}^2/2m_e$ and $m_e\omega^2\hat{x}^2/2$, I immediately find that the ground state expectation values $\langle\hat{p}^2\rangle/2m_e$ and $m_e\omega^2\langle\hat{x}^2\rangle/2$ are both equal to $\hbar\omega/4$. Isn't it remarkable that while the ground state of the harmonic oscillator is characterized by a definite value of energy, $\hbar\omega/2$, it is formed by two fluctuating quantities, the kinetic and potential energies, contributing equal amounts? One can actually see here a certain analogy with the classical harmonic oscillator, whose energy, while being time independent, includes contributions from the kinetic and potential energies, whose time dependencies totally compensate each other, yielding a constant sum.
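The equal split of the ground state energy can be checked numerically. The following is only a minimal sketch, assuming units in which $\hbar = m_e = \omega = 1$ and a small truncated basis; the names `matmul`, `x2_ground`, and `p2_ground` are illustrative choices, not notation used elsewhere in this chapter.

```python
import math

# Assumption for this sketch: units with hbar = m_e = omega = 1.
HBAR = M_E = OMEGA = 1.0
DIM = 4  # truncated basis |0>, ..., |DIM-1>; enough for ground state averages


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Lowering operator: <m|a|n> = sqrt(n) for m = n - 1; the raising operator is its transpose.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(DIM)] for m in range(DIM)]
a_dag = [[a[n][m] for n in range(DIM)] for m in range(DIM)]

# x is proportional to (a + a_dag), p is proportional to i (a_dag - a).
s = [[a[m][n] + a_dag[m][n] for n in range(DIM)] for m in range(DIM)]
d = [[a_dag[m][n] - a[m][n] for n in range(DIM)] for m in range(DIM)]

# Ground state averages are the (0, 0) matrix elements of the squared operators.
x2_ground = HBAR / (2 * M_E * OMEGA) * matmul(s, s)[0][0]
# p^2 = -(m hbar omega / 2) (a_dag - a)^2: the factor i**2 = -1 is written explicitly.
p2_ground = -(M_E * HBAR * OMEGA / 2) * matmul(d, d)[0][0]

kinetic = p2_ground / (2 * M_E)
potential = M_E * OMEGA**2 * x2_ground / 2
print(kinetic, potential)
```

Both printed numbers should come out as one quarter of $\hbar\omega$ in these units, and their sum reproduces the ground state energy $\hbar\omega/2$.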

OK, by finding the energy of the ground state, I took you down to the very bottom of the energy valley. Now it is time to climb back up, and we are going to do it with the assistance of ... wait for it ..., of course, the operator $\hat{a}^\dagger$! Actually, there is not much surprise or suspense here because this is exactly what happened with the angular momentum operators: we used $\hat{L}_-$ to find the lowest eigenvalue and the operator $\hat{L}_+$ to move up from there. My next step is pretty obvious now: consider $\hat{N}\hat{a}^\dagger|E_n\rangle$:

$$\hat{N}\hat{a}^\dagger|E_n\rangle = \hat{a}^\dagger|E_n\rangle + \hat{a}^\dagger\hat{N}|E_n\rangle = \nu_n\hat{a}^\dagger|E_n\rangle + \hat{a}^\dagger|E_n\rangle = (\nu_n + 1)\,\hat{a}^\dagger|E_n\rangle,$$

where this time I used the commutation relation from Eq. 7.31. So, as expected, $\hat{a}^\dagger|E_n\rangle$ is an eigenvector of the number operator with eigenvalue $\nu_n + 1$, i.e., the operator $\hat{a}^\dagger$ does generate eigenvectors with eigenvalues increasing by one for each application of the operator. Starting with the ground state, for which $\nu_{\min} = 0$, the operator $\hat{a}^\dagger$ will generate eigenvectors with eigenvalues of $\hat{N}$ equal to $1, 2, 3, \ldots$. In other words, the eigenvalues of the number operator are all natural numbers $n$ starting with 0, which makes the energy levels of the quantum harmonic oscillator, according to Eq. 7.29, equal to

$$E_n = \hbar\omega\left(\frac{1}{2} + n\right), \quad n = 0, 1, 2, \ldots \tag{7.36}$$

What is left for us now is to find the corresponding eigenvectors, for which, from now on, I will use the simplified notation $|n\rangle$. All I know at this point is that if $|n\rangle$ is an eigenvector corresponding to the eigenvalue $n$ of the number operator, then $\hat{a}^\dagger|n\rangle$ is an eigenvector corresponding to the eigenvalue $n + 1$. But I cannot guarantee that this new eigenvector will be normalized even if $|n\rangle$ is. Therefore, reserving the bra and ket notation only for normalized vectors, the best I can write for now is

$$\hat{a}^\dagger|n\rangle = c_n|n+1\rangle, \tag{7.37}$$

where $|n+1\rangle$ is assumed normalized and $c_n$ is a yet unknown normalization factor. Again, I cannot help but remind you that we encountered exactly the same situation when discussing eigenvectors of $\hat{L}^2$. To find $c_n$, I first write down the Hermitian conjugate of Eq. 7.37:

$$\langle n|\hat{a} = c_n^*\langle n+1|. \tag{7.38}$$

Then, multiplying the left-hand and right-hand sides of Eqs. 7.38 and 7.37, I get

$$\langle n|\hat{a}\hat{a}^\dagger|n\rangle = |c_n|^2\langle n+1|n+1\rangle.$$

Using the commutation relation 7.24 and taking into account that all vectors are now assumed normalized, I have

$$\langle n|\hat{N} + 1|n\rangle = |c_n|^2 \;\Rightarrow\; |c_n|^2 = n + 1.$$

Taking advantage of the freedom in the choice of the phase of the normalization factor, I choose $c_n$ to be real and positive. Now I have the rule for generating new normalized eigenvectors:

$$|n+1\rangle = \frac{1}{\sqrt{n+1}}\,\hat{a}^\dagger|n\rangle.$$

Applying this rule sequentially starting with the ground state, I end up with the following expression for an arbitrary eigenvector $|n\rangle$:

$$|n\rangle = \frac{1}{\sqrt{n!}}\left(\hat{a}^\dagger\right)^n|0\rangle, \tag{7.39}$$

where $|0\rangle$ stands for the eigenvector corresponding to the ground state. One can also show that

$$\hat{a}|n\rangle = \sqrt{n}\,|n-1\rangle, \tag{7.40}$$

but I will leave a proof of this relation as an exercise.

Equation 7.39 relates eigenvectors describing excited stationary states of the oscillator to its ground state. The latter, however, might appear to you to be undetermined, which is true if by "determining" it you mean expressing it in terms of some known vectors or functions. However, for most purposes, all the information that you need about the ground state is contained in Eq. 7.32, and in this sense, this equation is the definition of the ground state. You can use it to find answers to any specific question pertaining to this state. For instance, if you are interested in a function representing this state in the coordinate representation, you can use the coordinate representation of the momentum and coordinate operators to turn Eq. 7.32 into an easy-to-solve differential equation for $\varphi_0(x) \equiv \langle x|E_{\min}\rangle$:

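$$\left(\sqrt{\frac{m_e\omega}{2\hbar}}\,x + \sqrt{\frac{\hbar}{2m_e\omega}}\,\frac{d}{dx}\right)\varphi_0(x) = 0 \;\Rightarrow\; \frac{d\varphi_0(x)}{dx} = -\frac{m_e\omega}{\hbar}\,x\,\varphi_0(x) \;\Rightarrow\; \varphi_0 = C\exp\left(-\frac{x^2}{2\lambda^2}\right). \tag{7.41}$$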


The parameter $\lambda$ appearing in this equation is defined as

$$\lambda = \sqrt{\frac{\hbar}{m_e\omega}} \tag{7.42}$$

and has the dimension of length. It specifies the characteristic scale of the spatial dependence of the wave function: for $x \ll \lambda$ the wave function is almost constant, while for $x \gtrsim \lambda$ its behavior crosses over to a steep descent. It is easy to see that this parameter characterizes a transition between the classically allowed and classically forbidden regions of coordinates for the harmonic oscillator. Indeed, the substitution of the quantum ground state energy $E = \hbar\omega/2$ into Eq. 7.10 for the amplitude $A$ of the classical oscillator yields $A = \lambda$, which means that for the ground state of the oscillator $x < \lambda$ corresponds to the classically allowed region, and the region $x > \lambda$ is classically forbidden.

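The integration constant $C$ in Eq. 7.41 is found from the normalization condition:

$$C^2\int_{-\infty}^{\infty}\exp\left(-\frac{x^2}{\lambda^2}\right)dx = C^2\lambda\int_{-\infty}^{\infty}\exp\left(-\tilde{x}^2\right)d\tilde{x} = C^2\sqrt{\pi}\,\lambda = 1,$$

where I computed the integral by introducing a dimensionless variable $\tilde{x} = x/\lambda$ and using the known value of the Gaussian integral $\int_{-\infty}^{\infty}\exp\left(-y^2\right)dy = \sqrt{\pi}$. Thus, the normalized version of the oscillator ground state wave function becomes

$$\varphi_0 = \frac{1}{\sqrt{\sqrt{\pi}\,\lambda}}\exp\left(-\frac{x^2}{2\lambda^2}\right). \tag{7.43}$$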
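As a quick sanity check of Eq. 7.43, the normalization integral can be evaluated numerically. The sketch below assumes $\lambda = 1$ and uses a simple trapezoid rule; the helper names `phi0` and `integrate` are hypothetical. It also estimates the probability of finding the particle in the classically forbidden region $|x| > \lambda$, which for this Gaussian equals $\operatorname{erfc}(1)$, roughly 16%.

```python
import math


def phi0(x, lam=1.0):
    """Normalized ground state wave function of Eq. 7.43."""
    return math.exp(-x * x / (2 * lam * lam)) / math.sqrt(math.sqrt(math.pi) * lam)


def integrate(f, a, b, steps=20000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


lam = 1.0  # assumed value of the length parameter for this illustration
norm = integrate(lambda x: phi0(x, lam) ** 2, -10 * lam, 10 * lam)
inside = integrate(lambda x: phi0(x, lam) ** 2, -lam, lam)
forbidden = 1.0 - inside  # probability of |x| > lam

print(norm, forbidden, math.erfc(1.0))
```

The integration limits of $\pm 10\lambda$ stand in for infinity; the Gaussian tail beyond them is negligible at double precision.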

Having found the normalized ground state wave function in the coordinate representation, I can now use the raising operator (also rewritten in the coordinate representation) to generate wave functions representing an arbitrary stationary state of the Hamiltonian:

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\,\lambda\sqrt{\pi}}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right).$$

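Here I used the coordinate representation for the raising operator expressed in terms of the dimensionless variable $\tilde{x}$,

$$\hat{a}^\dagger = \sqrt{\frac{m_e\omega}{2\hbar}}\,x - \frac{\hbar}{\sqrt{2m_e\hbar\omega}}\,\frac{d}{dx} = \frac{x}{\sqrt{2}\,\lambda} - \frac{\lambda}{\sqrt{2}}\,\frac{d}{dx} = \frac{1}{\sqrt{2}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right),$$

substituted in Eq. 7.39. You can easily convince yourselves that the expression

$$\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right)$$

generates polynomials multiplied by the exponential function $\exp\left(-\tilde{x}^2/2\right)$. Pulling out this exponential factor, you end up with the so-called Hermite polynomials $H_n(\tilde{x})$ defined as

$$H_n(\tilde{x}) = \exp\left(\frac{\tilde{x}^2}{2}\right)\left(\tilde{x} - \frac{d}{d\tilde{x}}\right)^n\exp\left(-\frac{\tilde{x}^2}{2}\right),$$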

so that the oscillator's wave function takes the form

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\,\lambda\sqrt{\pi}}}\exp\left(-\frac{\tilde{x}^2}{2}\right)H_n(\tilde{x}). \tag{7.44}$$

Hermite polynomials are well known in mathematical physics and can be computed from the following somewhat simpler expression:

$$H_n(\tilde{x}) = (-1)^n e^{\tilde{x}^2}\frac{d^n}{d\tilde{x}^n}\left(e^{-\tilde{x}^2}\right). \tag{7.45}$$

The properties of these polynomials are well documented (google it!), so I will only emphasize one point: these polynomials and, therefore, the entire wave function have a definite parity, which is even for $n = 0, 2, 4, \ldots$ and odd for $n = 1, 3, 5, \ldots$. Obviously, this fact is the result of the symmetry of the harmonic oscillator potential with respect to inversion and is in agreement with our previous discussions of the connection between this symmetry and the parity of the quantum states. Figure 7.2 presents graphs of wave functions representing states with $n = 0, 1, 2, 3$, from which you can see that another general rule is also fulfilled here: the number of zeroes of the wave function coincides with the number of the respective energy level $n$. Note that $n$ is counted here starting from zero; therefore, the number of zeroes of the wave function is $n$ instead of $n - 1$.
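Both statements, the definite parity and the count of zeroes, are easy to probe numerically. The sketch below is only an illustration under stated assumptions: it evaluates $H_n$ through the standard three-term recurrence $H_{k+1}(x) = 2xH_k(x) - 2kH_{k-1}(x)$ (a well-known property of these polynomials that is not derived in this section), takes $\lambda = 1$, and counts sign changes of $\varphi_n$ on a grid. The function names are hypothetical.

```python
import math


def hermite(n, x):
    """Hermite polynomial H_n(x) from the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x  # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def phi(n, x_tilde, lam=1.0):
    """Wave function of Eq. 7.44 as a function of the dimensionless coordinate."""
    norm = 1.0 / math.sqrt(2.0**n * math.factorial(n) * lam * math.sqrt(math.pi))
    return norm * math.exp(-x_tilde**2 / 2.0) * hermite(n, x_tilde)


def count_zeros(n, x_max=6.0, steps=6000):
    """Count sign changes of phi_n on a grid shifted by half a step off x = 0."""
    h = 2.0 * x_max / steps
    grid = [-x_max + (k + 0.5) * h for k in range(steps)]
    values = [phi(n, x) for x in grid]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0.0)


for n in range(4):
    is_even = abs(phi(n, 0.7) - phi(n, -0.7)) < 1e-12
    print(n, "even" if is_even else "odd", count_zeros(n))
```

The grid is deliberately offset by half a step so that no grid point coincides with the zero at the origin that all odd states share.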

Fig. 7.2 Wave functions representing states of the harmonic oscillator with $n = 0$ (upper left graph), $n = 1$ (upper right graph), $n = 2$ (lower left graph), and $n = 3$ (lower right graph)

Coordinate representation is, obviously, not the only possible way to present eigenvectors of the harmonic oscillator. As a second example, I want to discuss a representation based on the eigenvectors $|n\rangle$ of the Hamiltonian. The eigenvectors themselves in this representation are presented, as all basis vectors, by columns with a single entry, equal to unity, in the row corresponding to the number of the respective basis vector. The Hamiltonian in this basis is presented by a diagonal matrix $H_{nm} = E_n\delta_{nm}$, where $E_n$ are the energy eigenvalues given by Eq. 7.36. Less trivial is the representation of the coordinate and momentum operators, and to find it I must first compute the matrix elements of the lowering (or raising, it does not really matter) operator, $a_{mn} = \langle m|\hat{a}|n\rangle$:

$$a_{mn} = \langle m|\hat{a}|n\rangle = \sqrt{n}\,\langle m|n-1\rangle = \sqrt{n}\,\delta_{m,n-1},$$

where I used Eq. 7.40 and the fact that all eigenvectors $|n\rangle$ are orthonormal. To visualize this matrix correctly, it is important to remember that the index $n$ in Eq. 7.40 starts counting from zero, and it is convenient to keep it this way when enumerating matrix elements. In this case the first row is given by $a_{0n}$, the second by $a_{1n}$, and so on. Respectively, the first column is given by $a_{m0}$. Non-zero elements in the matrix $a_{mn}$ are characterized by the column index exceeding the respective row index by one, i.e., $a_{0,1}$, $a_{1,2}$, etc.; they run parallel to the main diagonal, one element above it:

$$a_{mn} = \begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.46}$$

The matrix for the raising operator,

$$a^\dagger_{mn} = \langle m|\hat{a}^\dagger|n\rangle = \sqrt{n+1}\,\langle m|n+1\rangle = \sqrt{n+1}\,\delta_{m,n+1},$$

is obtained from Eq. 7.46 by simple transposition:

$$a^\dagger_{mn} = \begin{bmatrix} 0 & 0 & 0 & 0 & \cdots \\ \sqrt{1} & 0 & 0 & 0 & \cdots \\ 0 & \sqrt{2} & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.47}$$

Obtaining matrices for the coordinate and momentum operators is now as easy as adding two matrices. Using Eqs. 7.25 and 7.26, I find

$$x_{mn} = \sqrt{\frac{\hbar}{2m_e\omega}}\begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \tag{7.48}$$

$$p_{mn} = i\sqrt{\frac{m_e\hbar\omega}{2}}\begin{bmatrix} 0 & -\sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & -\sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & -\sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{7.49}$$

Both matrices are obviously Hermitian, but for the matrix representing the momentum operator, one must remember to do complex conjugation in addition to matrix transposition.
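The matrices of Eqs. 7.48 and 7.49 can be built and tested in a few lines. This is a minimal sketch, assuming units with $\hbar = m_e = \omega = 1$ and a basis truncated at `DIM` states (both are my assumptions for the illustration). Truncation has one visible side effect: the commutator $[\hat{x}, \hat{p}]$ equals $i\hbar$ on the diagonal everywhere except in the last element, which is an artifact of cutting off the infinite matrices.

```python
import math

HBAR = M_E = OMEGA = 1.0  # assumed units for this sketch
DIM = 6                   # size of the truncated basis |0>, ..., |DIM-1>


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_hermitian(A, tol=1e-12):
    """True if A equals its own conjugate transpose."""
    n = len(A)
    return all(abs(A[i][j] - complex(A[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))


# Eqs. 7.46 and 7.47: lowering operator and its transpose.
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(DIM)] for m in range(DIM)]
a_dag = [[a[n][m] for n in range(DIM)] for m in range(DIM)]

kx = math.sqrt(HBAR / (2 * M_E * OMEGA))
kp = math.sqrt(M_E * HBAR * OMEGA / 2)
x = [[kx * (a[m][n] + a_dag[m][n]) for n in range(DIM)] for m in range(DIM)]
p = [[1j * kp * (a_dag[m][n] - a[m][n]) for n in range(DIM)] for m in range(DIM)]

xp, px = matmul(x, p), matmul(p, x)
commutator = [[xp[m][n] - px[m][n] for n in range(DIM)] for m in range(DIM)]

print(is_hermitian(x), is_hermitian(p))
print([commutator[k][k] for k in range(DIM)])
```

Increasing `DIM` pushes the anomalous last diagonal element further out but never removes it, a reminder that the canonical commutation relation cannot hold exactly for finite matrices.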

7.1.2 Dynamics of Quantum Harmonic Oscillator

When talking (or thinking) about a harmonic oscillator, we are intuitively looking for a quantity that changes periodically with time, that is, oscillates. However, the stationary states, which I presented to you in the preceding section, are not very helpful in satisfying our intuitive subconscious desire to see a pendulum or at least something oscillating. Stationary states, even though they have non-zero energies associated with them, do not describe any dynamics or any physically relevant time dependence. Any expectation values computed with stationary states are time-independent, and those for the coordinate and momentum are zero, not only for the ground state but for any stationary state. This is, of course, obvious from the coordinate and momentum matrices presented in Eqs. 7.48 and 7.49, but one can also make a symmetry-based argument explaining this result. Even though this is a detour from the main goal of this section, I will take it because symmetry-based arguments are important in many areas of quantum mechanics and also because they are cool.

The Hamiltonian of the harmonic oscillator is invariant with respect to the inversion operator $\hat{\Pi}$ (see Sect. 6.2.1), and, therefore, its eigenvectors have a definite parity, as was already mentioned. Any expectation value involves a bra and ket pair built from the same eigenvector, and, therefore, whether the eigenvector is odd or even, the overall contribution of the pair is invariant with respect to $\hat{\Pi}$ (a product of two odd functions is even). At the same time, the coordinate and momentum operators are odd with respect to the parity transformation: $\hat{\Pi}^{-1}\hat{x}\hat{\Pi} = -\hat{x}$, $\hat{\Pi}^{-1}\hat{p}\hat{\Pi} = -\hat{p}$, as was shown in Sect. 5.1.2. Thus, on one hand, expectation values are supposed to change sign upon inversion, but on the other hand, they must not because they represent a property of the system invariant with respect to inversion. Thus, ponder this: you have a situation when a quantity must simultaneously change its sign while remaining the same. Clearly, there is only one quantity capable of this Houdini trick, and it is the great invention of Hindu mathematicians: the zero.

Now, back to the main topic. It is clear that the only way a quantum harmonic oscillator can actually oscillate is by being in a nonstationary state. We have discussed two approaches to dealing with nonstationary phenomena: the Schrödinger picture (operators are time-independent, state vectors are time-dependent) and the Heisenberg picture (operators depend on time, and state vectors do not). I will treat the dynamics of the harmonic oscillator using both pictures, beginning with the Heisenberg approach.

Heisenberg equations 4.24 can be derived for any operator, and in Sect. 4.2 I did that for the momentum and coordinate operators. Equation 4.33 provides you with the complete solution of the respective Heisenberg equations and essentially with all that you might need to describe the dynamics of any experimentally relevant quantity. However, I would like to revisit the problem of finding time-dependent position and momentum operators, but this time I will do it by solving the Heisenberg equations for the lowering and raising operators. The corresponding equations are

$$\frac{d\hat{a}_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}_H, \hat{H}\right], \qquad \frac{d\hat{a}^\dagger_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}^\dagger_H, \hat{H}\right].$$

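The expression for the Hamiltonian in terms of the Heisenberg operators $\hat{a}_H$, $\hat{a}^\dagger_H$ is the same as in terms of the Schrödinger operators:

$$e^{\frac{i}{\hbar}\hat{H}t}\,\hat{H}\,e^{-\frac{i}{\hbar}\hat{H}t} = \hat{H} = \hbar\omega\left(\frac{1}{2} + e^{\frac{i}{\hbar}\hat{H}t}\,\hat{a}^\dagger\,e^{-\frac{i}{\hbar}\hat{H}t}\,e^{\frac{i}{\hbar}\hat{H}t}\,\hat{a}\,e^{-\frac{i}{\hbar}\hat{H}t}\right) = \hbar\omega\left(\frac{1}{2} + \hat{a}^\dagger_H\hat{a}_H\right),$$

and, therefore, all the commutation relations, which we calculated for the Schrödinger operators, remain the same. In particular, using Eqs. 7.30 and 7.31, I can find $\left[\hat{a}_H, \hat{H}\right] = \hbar\omega\left[\hat{a}_H, \hat{N}\right] = \hbar\omega\,\hat{a}_H$ and $\left[\hat{a}^\dagger_H, \hat{H}\right] = \hbar\omega\left[\hat{a}^\dagger_H, \hat{N}\right] = -\hbar\omega\,\hat{a}^\dagger_H$.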

Substituting these commutators into the Heisenberg equations, I obtain the following nice-looking equations:

$$\frac{d\hat{a}_H}{dt} = -i\omega\,\hat{a}_H, \tag{7.50}$$

$$\frac{d\hat{a}^\dagger_H}{dt} = i\omega\,\hat{a}^\dagger_H. \tag{7.51}$$

Unlike the equations for the momentum and coordinate operators, Eqs. 7.50 and 7.51 are not coupled, so they can be solved independently; in that they have a striking resemblance to the classical Eqs. 7.14 and 7.13. Solutions to these equations are easy to write:

$$\hat{a}_H = \hat{a}\,e^{-i\omega t}; \qquad \hat{a}^\dagger_H = \hat{a}^\dagger e^{i\omega t}, \tag{7.52}$$

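where $\hat{a}$ and $\hat{a}^\dagger$ are the Schrödinger operators that play the role of initial conditions for the Heisenberg equations. Equations 7.25 and 7.26 are obviously valid for the Heisenberg operators as well, so that one can obtain for the time-dependent coordinate and momentum operators:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left(\hat{a}\,e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right), \tag{7.53}$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left(\hat{a}^\dagger e^{i\omega t} - \hat{a}\,e^{-i\omega t}\right). \tag{7.54}$$

Using the Euler relation for the exponential functions, I can rewrite this result in the form previously derived in Eq. 4.33:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left[\left(\hat{a} + \hat{a}^\dagger\right)\cos\omega t + i\left(\hat{a}^\dagger - \hat{a}\right)\sin\omega t\right], \tag{7.55}$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[\left(\hat{a}^\dagger - \hat{a}\right)\cos\omega t + i\left(\hat{a} + \hat{a}^\dagger\right)\sin\omega t\right], \tag{7.56}$$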

which agrees with Eq. 4.33 after one recognizes that at $t = 0$ these equations reproduce the coordinate and momentum operators in the Schrödinger representation. Any of Eqs. 7.53–7.56 can be used, for instance, to compute the expectation values of the coordinate and momentum for an arbitrary initial state $|\chi_0\rangle$. This task can be facilitated by using the basis of the eigenvectors $|n\rangle$ to represent this state:

$$|\chi_0\rangle = \sum_{n=0}^{\infty} c_n|n\rangle. \tag{7.57}$$

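It is a bit more convenient to carry out these calculations using the exponential form of the time dependence as in Eqs. 7.53 and 7.54:

$$\begin{aligned}
\langle x\rangle &= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n\langle m|\hat{a}|n\rangle + e^{i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n\langle m|\hat{a}^\dagger|n\rangle\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n\sqrt{n}\,\delta_{m,n-1} + e^{i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n\sqrt{n+1}\,\delta_{m,n+1}\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1} + e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right],
\end{aligned} \tag{7.58}$$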

where I used the previously derived matrix elements of the lowering and raising operators. Now we are getting something familiar: the expectation value of the coordinate does indeed oscillate with the frequency of the harmonic oscillator $\omega$, and, what is interesting, this behavior does not depend on the actual initial state, as long as it has contributions from at least two adjacent stationary states so that both $c_m$ and $c_{m+1}$ are different from zero. This requirement, of course, excludes initial stationary states, which would have only one nonvanishing coefficient $c_m$, as well as nonstationary states with a definite parity, which would contain only coefficients $c_m$ with either odd or even $m$. With a bit of imagination, you can recognize in Eq. 7.58 the behavior typical of a classical harmonic oscillator, which can be described as

$$\langle x\rangle = A\cos\left(\omega t + \psi\right), \tag{7.59}$$

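where the amplitude and phase of the oscillations are determined by the initial conditions (as they also are in the classical case). The two terms in Eq. 7.58 are complex conjugates of each other, so their sum is twice the real part of either one, which gives

$$A = \sqrt{\frac{2\hbar}{m_e\omega}}\left|\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right|; \qquad \psi = \arctan\frac{\operatorname{Im}\left(\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right)}{\operatorname{Re}\left(\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right)}. \tag{7.60}$$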

Similar calculations for the momentum operator produce

$$\langle p\rangle = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1} - e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right], \tag{7.61}$$

which can be rewritten using the same amplitude and phase as

$$\langle p\rangle = -m_e\omega A\sin\left(\omega t + \psi\right) \tag{7.62}$$

in full agreement with the Ehrenfest theorem, Eq. 4.17.

Before shifting attention to the Schrödinger picture, let me consider a few more examples of the application of the Heisenberg equations.
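The passage from the sums in Eq. 7.58 to a single cosine can be verified numerically for any finite set of coefficients. The sketch below is an illustration under stated assumptions: units with $\hbar = m_e = \omega = 1$, an arbitrarily chosen pair of coefficients, and hypothetical helper names. The amplitude is computed as twice the modulus of one of the two complex-conjugate terms of Eq. 7.58, and the phase is taken with a quadrant-aware `cmath.phase` rather than a bare arctangent.

```python
import cmath
import math

HBAR = M_E = OMEGA = 1.0  # assumed units for this sketch


def overlap_sum(c):
    """S = sum over m of conj(c_m) * c_{m+1} * sqrt(m+1) for a finite coefficient list."""
    return sum(c[m].conjugate() * c[m + 1] * math.sqrt(m + 1)
               for m in range(len(c) - 1))


def x_mean(c, t):
    """Expectation value of the coordinate from the last line of Eq. 7.58."""
    s = overlap_sum(c)
    k = math.sqrt(HBAR / (2 * M_E * OMEGA))
    value = k * (cmath.exp(-1j * OMEGA * t) * s + cmath.exp(1j * OMEGA * t) * s.conjugate())
    return value.real  # the imaginary part cancels between the two terms


def amplitude_phase(c):
    """Amplitude and phase such that <x> = A cos(omega t + phase)."""
    s = overlap_sum(c)
    amplitude = 2 * math.sqrt(HBAR / (2 * M_E * OMEGA)) * abs(s)
    phase = cmath.phase(s.conjugate())
    return amplitude, phase


# Example: a normalized superposition of |0> and |1> with a relative phase of pi/2.
c = [complex(0.6, 0.0), complex(0.0, 0.8)]
A, psi = amplitude_phase(c)
for t in (0.0, 0.3, 1.7, 4.0):
    print(t, x_mean(c, t), A * math.cos(OMEGA * t + psi))
```

The two printed columns should agree for every time, whatever coefficients are plugged in, as long as at least two adjacent coefficients are non-zero.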

Example 18 (Uncertainties of Coordinate and Momentum of a Harmonic Oscillator) Assume that the harmonic oscillator is initially in a state described by an equal superposition of its ground and first excited states:

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$

Compute uncertainties of the coordinate and momentum operators at an arbitrary time t and demonstrate, using the Heisenberg picture, that the uncertainty relation is fulfilled at all times.

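Using Eqs. 7.58 and 7.61 with $c_0 = c_1 = 1/\sqrt{2}$ and $c_m = 0$ for $m > 1$, I find for the expectation values

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t, \qquad \langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I first have to compute $\langle p^2\rangle$ and $\langle x^2\rangle$. I begin by computing

$$\hat{x}^2 = \frac{\hbar}{2m_e\omega}\left(\hat{a}\,e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right)^2 = \frac{\hbar}{2m_e\omega}\left[\hat{a}^2 e^{-2i\omega t} + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a} + \left(\hat{a}^\dagger\right)^2 e^{2i\omega t}\right],$$

$$\hat{p}^2 = -\frac{m_e\hbar\omega}{2}\left(\hat{a}\,e^{-i\omega t} - \hat{a}^\dagger e^{i\omega t}\right)^2 = -\frac{m_e\hbar\omega}{2}\left[\hat{a}^2 e^{-2i\omega t} - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a} + \left(\hat{a}^\dagger\right)^2 e^{2i\omega t}\right].$$

Now, remembering that $\hat{a}|0\rangle = 0$, $\hat{a}|1\rangle = |0\rangle$, $\hat{a}|2\rangle = \sqrt{2}|1\rangle$, $\hat{a}^\dagger|0\rangle = |1\rangle$, $\hat{a}^\dagger|1\rangle = \sqrt{2}|2\rangle$, $\hat{a}^\dagger|2\rangle = \sqrt{3}|3\rangle$, I get

$$\hat{x}^2|\alpha_0\rangle = \frac{\hbar}{2\sqrt{2}\,m_e\omega}\left[|0\rangle + \sqrt{2}\,e^{2i\omega t}|2\rangle + 2|1\rangle + |1\rangle + \sqrt{6}\,e^{2i\omega t}|3\rangle\right],$$

$$\hat{p}^2|\alpha_0\rangle = -\frac{m_e\hbar\omega}{2\sqrt{2}}\left[-|0\rangle + \sqrt{2}\,e^{2i\omega t}|2\rangle - 2|1\rangle - |1\rangle + \sqrt{6}\,e^{2i\omega t}|3\rangle\right],$$

and, finally,

$$\langle\alpha_0|\hat{x}^2|\alpha_0\rangle = \frac{\hbar}{4m_e\omega}(1 + 3) = \frac{\hbar}{m_e\omega}, \qquad \langle\alpha_0|\hat{p}^2|\alpha_0\rangle = \frac{m_e\hbar\omega}{4}(1 + 3) = m_e\hbar\omega.$$

Now I can find the uncertainties:

$$\Delta x = \sqrt{\langle\hat{x}^2\rangle - \langle\hat{x}\rangle^2} = \sqrt{\frac{\hbar}{m_e\omega}}\sqrt{1 - \frac{1}{2}\cos^2\omega t},$$

$$\Delta p = \sqrt{\langle\hat{p}^2\rangle - \langle\hat{p}\rangle^2} = \sqrt{m_e\hbar\omega}\sqrt{1 - \frac{1}{2}\sin^2\omega t},$$

$$\Delta x\,\Delta p = \hbar\sqrt{\frac{1}{2} + \frac{1}{16}\sin^2 2\omega t} \geq \frac{\hbar}{\sqrt{2}} \approx 0.71\hbar,$$

which exceeds $\hbar/2$ at all times, in agreement with the uncertainty principle.

There also exists an alternative approach to computing time-dependent averages of various observables using the Heisenberg picture, which allows one to establish their dependence on time in a more generic way. To develop such an approach, let me first rewrite Eqs. 7.55 and 7.56 using Eqs. 7.25 and 7.26 for the Schrödinger versions of the coordinate and momentum operators, which I will designate here as $\hat{x}_0$ and $\hat{p}_0$ to emphasize the fact that they serve as initial values for the Heisenberg equations: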

$$\hat{x}_H = \hat{x}_0\cos\omega t + \frac{1}{m_e\omega}\,\hat{p}_0\sin\omega t, \tag{7.63}$$

$$\hat{p}_H = \hat{p}_0\cos\omega t - m_e\omega\,\hat{x}_0\sin\omega t. \tag{7.64}$$

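Now, let's say I want to compute the uncertainty of the coordinate for an arbitrary state $|\alpha\rangle$. The expectation values of the coordinate and momentum in this state, using Eqs. 7.63 and 7.64, can be written as

$$\langle x\rangle = \langle\hat{x}_0\rangle\cos\omega t + \frac{1}{m_e\omega}\langle\hat{p}_0\rangle\sin\omega t,$$

$$\langle p\rangle = \langle\hat{p}_0\rangle\cos\omega t - m_e\omega\langle\hat{x}_0\rangle\sin\omega t,$$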

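where $\langle\hat{x}_0\rangle$ and $\langle\hat{p}_0\rangle$ are time-independent "Schrödinger" expectation values that can be computed for a given state using any of the representations for the Schrödinger coordinate and momentum operators. Similarly, I can find for the expectation values of the squared operators

$$\langle\hat{x}^2\rangle = \langle\hat{x}_0^2\rangle\cos^2\omega t + \frac{1}{2m_e\omega}\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle\right)\sin 2\omega t + \frac{1}{m_e^2\omega^2}\langle\hat{p}_0^2\rangle\sin^2\omega t,$$

$$\langle\hat{p}^2\rangle = \langle\hat{p}_0^2\rangle\cos^2\omega t - \frac{1}{2}m_e\omega\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle\right)\sin 2\omega t + m_e^2\omega^2\langle\hat{x}_0^2\rangle\sin^2\omega t,$$

which will yield the following for the uncertainties:

$$(\Delta x)^2 = (\Delta\hat{x}_0)^2\cos^2\omega t + \frac{1}{m_e^2\omega^2}(\Delta\hat{p}_0)^2\sin^2\omega t + \frac{1}{2m_e\omega}\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle - 2\langle\hat{x}_0\rangle\langle\hat{p}_0\rangle\right)\sin 2\omega t,$$

$$(\Delta p)^2 = (\Delta\hat{p}_0)^2\cos^2\omega t + m_e^2\omega^2(\Delta\hat{x}_0)^2\sin^2\omega t - \frac{1}{2}m_e\omega\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle - 2\langle\hat{x}_0\rangle\langle\hat{p}_0\rangle\right)\sin 2\omega t.$$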

I already mentioned it once, but it is worth emphasizing again: all expectation values in this expression refer to Schrödinger operators and can be computed using any of the representations for the latter. Let me illustrate this point by considering the following example.

Example 19 (Harmonic Oscillator with Shifted Minimum of the Potential) Consider a harmonic oscillator with mass $m_e$ and frequency $\omega$ in the ground state. Suddenly, without disruption of the oscillator's state, the minimum of the potential shifts by $d$ along the axis of oscillations, and the stiffness of the potential changes such that it is now characterized by a new classical frequency $\Omega$. Find the expectation values and uncertainties of the coordinate and momentum of the electron in the potential with the new position of its minimum.

It is convenient to solve this problem using the coordinate representation for the initial state and for the Schrödinger operators $\hat{x}$ and $\hat{p}$. First of all, let's agree to place the origin of the $X$-axis at the new position of the minimum. Then, the initial wave function, which is the ground state wave function of the oscillator with the potential in the original position, is

$$\psi_0(x) = \left(\frac{m_e\omega}{\pi\hbar}\right)^{1/4}\exp\left[-\frac{m_e\omega}{2\hbar}(x + d)^2\right],$$

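where $x$ is counted from the new position of the potential minimum. The expectation values of the Schrödinger operators, $\langle\hat{x}_0\rangle$ and $\langle\hat{p}_0\rangle$, are

$$\langle\hat{x}_0\rangle = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty} x\exp\left[-\frac{m_e\omega}{\hbar}(x + d)^2\right]dx = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}(x - d)\exp\left[-\frac{m_e\omega}{\hbar}x^2\right]dx = -d,$$

where I made a substitution of variables and took into account that the Gaussian under the integral is even, so that the term linear in $x$ vanishes. Similarly, I can find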

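$$\begin{aligned}
\langle\hat{p}_0\rangle &= -i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}\exp\left[-\frac{m_e\omega}{2\hbar}(x + d)^2\right]\frac{d}{dx}\left\{\exp\left[-\frac{m_e\omega}{2\hbar}(x + d)^2\right]\right\}dx \\
&= i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\,\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty}\exp\left[-\frac{m_e\omega}{\hbar}(x + d)^2\right](x + d)\,dx = 0.
\end{aligned}$$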

Thus, I have for the time-dependent expectation values

$$\langle x\rangle = -d\cos\Omega t, \qquad \langle p\rangle = m_e\Omega d\sin\Omega t.$$

(Obviously, the dynamics of the raising and lowering operators is defined by the new frequency $\Omega$.) In order to find the respective uncertainties, I need to compute $\Delta\hat{x}_0$, $\Delta\hat{p}_0$, and $\langle\hat{x}_0\hat{p}_0\rangle$. The uncertainties of the regular Schrödinger coordinate and momentum operators do not depend on the position of the potential minimum with respect to the origin of the coordinate axis, so I can simply recycle the results from Eqs. 7.34 and 7.35:

$$\Delta\hat{x}_0 = \sqrt{\frac{\hbar}{2m_e\omega}}; \qquad \Delta\hat{p}_0 = \sqrt{\frac{\hbar m_e\omega}{2}}.$$

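For the last expectation value, $\langle\hat{x}_0\hat{p}_0\rangle$, I will actually have to do some work:

$$\begin{aligned}
\langle\hat{x}_0\hat{p}_0\rangle &= -i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\int_{-\infty}^{\infty} x\exp\left[-\frac{m_e\omega}{2\hbar}(x + d)^2\right]\frac{d}{dx}\left\{\exp\left[-\frac{m_e\omega}{2\hbar}(x + d)^2\right]\right\}dx \\
&= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty} x\,(x + d)\exp\left[-\frac{m_e\omega}{\hbar}(x + d)^2\right]dx \\
&= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty} x\,(x - d)\exp\left[-\frac{m_e\omega}{\hbar}x^2\right]dx \\
&= i\hbar\,\frac{m_e\omega}{\hbar}\,\frac{\hbar}{2m_e\omega} = \frac{i\hbar}{2}.
\end{aligned}$$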

In the last line of this expression, I took into account that the integral with the factor linear in $x$ vanishes because of the oddness of the integrand, while the integral containing $x^2$ together with the normalization factor of the wave function reproduces the squared uncertainty of the coordinate, $(\Delta\hat{x}_0)^2$. If you are spooked by the imaginary result here, you shouldn't be. The operator $\hat{x}_0\hat{p}_0$ is not Hermitian, and its expectation value does not have to be real. To complete this calculation, I would have to compute $\langle\hat{p}_0\hat{x}_0\rangle$, but I will save us some time and use the canonical commutation relation $[\hat{x}, \hat{p}] = i\hbar$ to find

$$\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle = 2\langle\hat{x}_0\hat{p}_0\rangle - i\hbar = 0.$$

Oops, so much effort to get zero in the end? Feeling disappointed and a bit cheated? Well, you should be, because we could have guessed that the answer here is zero without any calculations. Indeed, the momentum operator contains the imaginary unit in it, and with the wave function being completely real, this imaginary factor is not going anywhere. But, on the other hand, the result must be real because $\hat{x}_0\hat{p}_0 + \hat{p}_0\hat{x}_0$ is a Hermitian operator. So, the only conclusion a reasonable person can draw from this conundrum is that the result must be zero. Thus, we finally have for the time-dependent uncertainties:

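$$(\Delta x)^2 = \frac{\hbar}{2m_e\omega}\cos^2\Omega t + \frac{1}{m_e^2\Omega^2}\,\frac{\hbar m_e\omega}{2}\sin^2\Omega t = \frac{\hbar}{2m_e\omega}\left(\cos^2\Omega t + \frac{\omega^2}{\Omega^2}\sin^2\Omega t\right),$$

$$(\Delta p)^2 = \frac{\hbar m_e\omega}{2}\cos^2\Omega t + m_e^2\Omega^2\,\frac{\hbar}{2m_e\omega}\sin^2\Omega t = \frac{\hbar m_e\omega}{2}\left(\cos^2\Omega t + \frac{\Omega^2}{\omega^2}\sin^2\Omega t\right),$$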

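and for their product

$$\begin{aligned}
(\Delta x)^2(\Delta p)^2 &= \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2}\right)\cos^2\Omega t\,\sin^2\Omega t\right] \\
&= \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\,\sin^2\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\,\sin^2\Omega t\right] \\
&= \frac{\hbar^2}{4}\left[1 + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\,\sin^2\Omega t\right],
\end{aligned}$$

where in the second line I added and subtracted the term $2\cos^2\Omega t\,\sin^2\Omega t$ and in the third line used the identity $\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\,\sin^2\Omega t = \left(\cos^2\Omega t + \sin^2\Omega t\right)^2 = 1$. The function $y + 1/y$ with $y = \omega^2/\Omega^2$, which appears in the final result, has a minimum at $y = 1$ ($\Omega = \omega$), at which point the product $(\Delta x)^2(\Delta p)^2$ becomes $\hbar^2/4$. For all other relations between the two frequencies, the product generally exceeds this value, in full agreement with the uncertainty principle. It is interesting to note that as the uncertainties oscillate, their product returns to its minimum value at times $t_n = \pi n/(2\Omega)$.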
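The closed-form product can be compared against the two variances computed separately. The following sketch assumes units with $\hbar = m_e = 1$ and arbitrary illustrative values for the old and new frequencies (`omega` and `Omega` in the code); the cross term is dropped because $\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle - 2\langle\hat{x}_0\rangle\langle\hat{p}_0\rangle = 0$ for this initial state.

```python
import math

HBAR = M_E = 1.0  # assumed units for this sketch


def variances(t, omega, Omega):
    """(Delta x)^2 and (Delta p)^2 after the sudden change of the potential."""
    dx0_sq = HBAR / (2 * M_E * omega)   # Eq. 7.34
    dp0_sq = HBAR * M_E * omega / 2     # Eq. 7.35
    c2 = math.cos(Omega * t) ** 2
    s2 = math.sin(Omega * t) ** 2
    dx_sq = dx0_sq * c2 + dp0_sq * s2 / (M_E * Omega) ** 2
    dp_sq = dp0_sq * c2 + (M_E * Omega) ** 2 * dx0_sq * s2
    return dx_sq, dp_sq


def product(t, omega, Omega):
    dx_sq, dp_sq = variances(t, omega, Omega)
    return dx_sq * dp_sq


def closed_form(t, omega, Omega):
    """Final line of the product formula, with y = omega^2 / Omega^2."""
    y = (omega / Omega) ** 2
    return HBAR**2 / 4 * (1 + (y + 1 / y - 2)
                          * math.cos(Omega * t) ** 2 * math.sin(Omega * t) ** 2)


omega, Omega = 1.0, 2.5  # illustrative old and new frequencies
for t in (0.0, 0.2, 0.7, math.pi * 3 / (2 * Omega)):
    print(t, product(t, omega, Omega), closed_form(t, omega, Omega))
```

The last sampled time is one of the instants $t_n = \pi n/(2\Omega)$, at which the product should drop back to $\hbar^2/4$.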

In the Schrödinger picture, the dynamics of quantum systems is described by the time dependence of the vectors representing quantum states. For the initial state given by Eq. 7.57, the time-dependent state can be presented as (see Eq. 4.15)

$$|\chi(t)\rangle = \sum_{n=0}^{\infty} c_n e^{-i\omega(n + 1/2)t}|n\rangle. \tag{7.65}$$

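Computing the expectation value of the coordinate with this state and using again the representation of the coordinate operator in terms of the lowering and raising operators, I have

$$\begin{aligned}
\langle x\rangle &= \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\langle m|\hat{a}|n\rangle + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\langle m|\hat{a}^\dagger|n\rangle\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\sqrt{n}\,\delta_{m,n-1} + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\sqrt{n+1}\,\delta_{m,n+1}\right] \\
&= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1} + e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right]
\end{aligned} \tag{7.66}$$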

in full agreement with Eq. 7.58 obtained using the Heisenberg representation. What is interesting about this result is that in the beginning of the computations, we had complex exponential functions with all frequencies $\omega(m - n)$. However, after the matrix elements of the lowering and raising operators have been taken into account, only terms with the single frequency $\omega$ survived. In the Heisenberg approach, frequencies $\omega(m - n)$ never appear because the properties of $\hat{a}$ and $\hat{a}^\dagger$ are incorporated from the very beginning at the level of the Heisenberg equations. Similar expressions can be easily derived for the expectation value of the momentum operator:

$$\langle p\rangle = i\sqrt{\frac{\hbar m_e\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1} - e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right],$$

while generic expressions for the uncertainties of the coordinate and momentum operators in the Schrödinger picture are much more cumbersome and more difficult to derive. Thus, I will illustrate the derivation of the uncertainties for time-dependent states in the Schrödinger picture using the same state as in Example 18, which was previously treated in the Heisenberg picture.

Example 20 (Uncertainties of the Coordinate and Momentum of the Quantum Harmonic Oscillator in the Schrödinger Picture) Let me remind you that we are dealing with a harmonic oscillator prepared in a state

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right),$$

and we want to compute the uncertainties of the coordinate and momentum operators at an arbitrary time t using the Schrödinger picture.

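Comparing the expression for the initial state with Eq. 7.57, the expansion coefficients $c_n$ in Eq. 7.65 can be identified as $c_0 = c_1 = 1/\sqrt{2}$, while all other coefficients vanish. Thus, the time-dependent state vector now becomes

$$|\chi(t)\rangle = \frac{1}{\sqrt{2}}\left[\exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$

The expectation values are immediately found from Eq. 7.66 and the analogous expression for the momentum:

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t; \qquad \langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I need

$$\hat{x}^2 = \frac{\hbar}{2m_e\omega}\left[\hat{a}^2 + \left(\hat{a}^\dagger\right)^2 + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right],$$

$$\hat{p}^2 = -\frac{\hbar\omega m_e}{2}\left[\hat{a}^2 + \left(\hat{a}^\dagger\right)^2 - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a}\right].$$

Using again the properties of the lowering and raising operators, I find

$$\hat{x}^2|\chi(t)\rangle = \frac{\hbar}{2\sqrt{2}\,m_e\omega}\left[\sqrt{2}\exp\left(-\frac{i}{2}\omega t\right)|2\rangle + \exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \sqrt{6}\exp\left(-\frac{3i}{2}\omega t\right)|3\rangle + 3\exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$

Now, premultiplying this result by $\langle\chi(t)|$ and using the orthogonality of the eigenvectors, I find

$$\langle\hat{x}^2\rangle = \frac{\hbar}{m_e\omega},$$

in complete agreement with the result obtained in the Heisenberg picture. I will leave computing the result for the momentum operator to you.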
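For readers who want a numerical cross-check of this example before doing the momentum calculation by hand, here is a minimal sketch of the Schrödinger-picture computation. It assumes units with $\hbar = m_e = \omega = 1$ and a basis truncated at `DIM` states (ample for a state built from $|0\rangle$ and $|1\rangle$); all helper names are hypothetical. The squared averages are evaluated as $\langle\hat{x}^2\rangle = \|\hat{x}|\chi\rangle\|^2$, which is legitimate because $\hat{x}$ and $\hat{p}$ are Hermitian.

```python
import cmath
import math

HBAR = M_E = OMEGA = 1.0  # assumed units for this sketch
DIM = 6                   # truncated basis |0>, ..., |DIM-1>

KX = math.sqrt(HBAR / (2 * M_E * OMEGA))
KP = math.sqrt(M_E * HBAR * OMEGA / 2)


def ladder(m, n):
    """Return (<m|a|n>, <m|a_dag|n>) as in Eqs. 7.46 and 7.47."""
    lower = math.sqrt(n) if m == n - 1 else 0.0
    upper = math.sqrt(n + 1) if m == n + 1 else 0.0
    return lower, upper


X = [[KX * sum(ladder(m, n)) for n in range(DIM)] for m in range(DIM)]
P = [[1j * KP * (ladder(m, n)[1] - ladder(m, n)[0]) for n in range(DIM)]
     for m in range(DIM)]


def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[i][j] * v[j] for j in range(DIM)) for i in range(DIM)]


def inner(u, v):
    """Scalar product <u|v>."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def state(t):
    """Components of the time-dependent state: c_n exp(-i omega (n + 1/2) t)."""
    c = [1 / math.sqrt(2), 1 / math.sqrt(2)] + [0.0] * (DIM - 2)
    return [cn * cmath.exp(-1j * OMEGA * (n + 0.5) * t) for n, cn in enumerate(c)]


def moments(t):
    chi = state(t)
    x_chi, p_chi = apply(X, chi), apply(P, chi)
    x_mean = inner(chi, x_chi).real
    p_mean = inner(chi, p_chi).real
    x2 = inner(x_chi, x_chi).real  # <x^2> = ||x chi||^2
    p2 = inner(p_chi, p_chi).real  # <p^2> = ||p chi||^2
    return x_mean, p_mean, x2, p2


def uncertainty_product(t):
    x_mean, p_mean, x2, p2 = moments(t)
    return math.sqrt(x2 - x_mean**2) * math.sqrt(p2 - p_mean**2)


for t in (0.0, 0.4, 1.1, 2.9):
    print(t, moments(t), uncertainty_product(t))
```

In these units $\langle\hat{x}^2\rangle$ and $\langle\hat{p}^2\rangle$ should both stay equal to one at all times, while the product of the uncertainties oscillates without ever dropping below $\hbar/2$.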

7.2 Isotropic Three-Dimensional Harmonic Oscillator

Using the concept of normal coordinates, any three-dimensional (or even multi-particle) harmonic oscillator can be reduced to a collection of one-dimensional oscillators with the total Hamiltonian being the sum of one-dimensional Hamiltonians. The spectrum of eigenvalues in this case is obtained by simply summing up the eigenvalues of each one-dimensional component, and the respective eigenvectors are obtained as a direct product of one-dimensional eigenvectors. To illustrate this point, consider a Hamiltonian of the form

$$\hat{H} = \frac{\hat{p}_x^2}{2m_{ex}} + \frac{\hat{p}_y^2}{2m_{ey}} + \frac{\hat{p}_z^2}{2m_{ez}} + \frac{1}{2}\left(m_{ex}\omega_x^2\hat{x}^2 + m_{ey}\omega_y^2\hat{y}^2 + m_{ez}\omega_z^2\hat{z}^2\right) = \hat{H}_x + \hat{H}_y + \hat{H}_z. \tag{7.67}$$

I can define a state characterized by three quantum numbers, $|n_x, n_y, n_z\rangle$, which can be considered as a "product" of the one-dimensional eigenvectors defined in the previous section, $|n_x, n_y, n_z\rangle \equiv |n_x\rangle|n_y\rangle|n_z\rangle$, where the last notation does not presume any kind of actual "multiplication" but just serves as a reminder that the $x$-dependent part of the Hamiltonian 7.67 acts only on the $|n_x\rangle$ portion of the eigenvector, $\hat{H}_y$ acts only on $|n_y\rangle$, and so on. Thus, as a result, I have

$$\left(\hat{H}_x + \hat{H}_y + \hat{H}_z\right)|n_x\rangle|n_y\rangle|n_z\rangle = \left[\hbar\omega_x\left(n_x + \frac{1}{2}\right) + \hbar\omega_y\left(n_y + \frac{1}{2}\right) + \hbar\omega_z\left(n_z + \frac{1}{2}\right)\right]|n_x\rangle|n_y\rangle|n_z\rangle,$$

where $n_x$, $n_y$, and $n_z$ independently take integer values starting from zero. The position representation of the eigenvectors is obtained as

$$\varphi_{n_x,n_y,n_z}(x, y, z) = \langle x, y, z|n_x, n_y, n_z\rangle \equiv \langle x|n_x\rangle\langle y|n_y\rangle\langle z|n_z\rangle = \varphi_{n_x}(x)\,\varphi_{n_y}(y)\,\varphi_{n_z}(z),$$

where each $\varphi_{n_i}(r_i)$ is given by Eq. 7.44.

In the most general case, when the parameters in $\hat{H}_x$, $\hat{H}_y$, and $\hat{H}_z$ are all different, we end up with distinct eigenvalues characterized by three independent integers. The energy of the ground state is characterized by $n_x = n_y = n_z = 0$ and is given by $E_{0,0,0} = \frac{1}{2}\hbar\left(\omega_x + \omega_y + \omega_z\right)$.

If, however, all masses and all three frequencies are equal to each other so that the Hamiltonian becomes

$$
\hat{H} = \frac{\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2}{2m_e} + \frac{1}{2}m_e\omega^2\left(\hat{x}^2 + \hat{y}^2 + \hat{z}^2\right), \tag{7.68}
$$

a new phenomenon emerges. The energy eigenvalues are now given by

$$
E_{n_x,n_y,n_z} = \hbar\omega\left(\frac{3}{2} + n_x + n_y + n_z\right),
$$

and the energy takes the same value for different eigenvectors as long as the respective indexes obey the condition $n = n_x + n_y + n_z$. In other words, the eigenvalues in this case become degenerate: several distinct vectors belong to the same eigenvalue. The number of degenerate eigenvectors is relatively easy to compute: for each $n$ you can choose $n_x$ to be anything between $0$ and $n$, and once $n_x$ is chosen, $n_y$ can be anything between $0$ and $n - n_x$, so there are $n - n_x + 1$ choices. Once $n_x$ and $n_y$ are determined, the remaining quantum number $n_z$ becomes uniquely defined. Thus, the total number of choices of $n_x$ and $n_y$ for any given $n$ can be found as

$$
\sum_{n_x=0}^{n}\left(n - n_x + 1\right) = (n + 1)(n + 1) - n(n + 1)/2 = (n + 1)(n + 2)/2.
$$

This degeneracy can be easily traced to the symmetry of the system, which has emerged once I made the parameters of the oscillator independent of the direction.
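The counting argument for the Cartesian quantum numbers can be checked by brute force. The sketch below is only an illustration: the function names `cartesian_degeneracy` and `closed_form` are not from the text. It enumerates all triples of non-negative integers with a given sum and compares the count with $(n+1)(n+2)/2$.

```python
from itertools import product


def cartesian_degeneracy(n):
    """Count triples (nx, ny, nz) of non-negative integers with nx + ny + nz == n."""
    return sum(
        1
        for nx, ny, nz in product(range(n + 1), repeat=3)
        if nx + ny + nz == n
    )


def closed_form(n):
    """Degeneracy formula quoted in the text: (n + 1)(n + 2) / 2."""
    return (n + 1) * (n + 2) // 2


if __name__ == "__main__":
    for n in range(6):
        # Each row: n, brute-force count, closed-form count
        print(n, cartesian_degeneracy(n), closed_form(n))
```

The two columns should coincide for every $n$, which is just the arithmetic series summed in the text.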

7.2.1 Isotropic Oscillator in Spherical Coordinates

Even though we already know the solution to the problem of an isotropic harmonic oscillator, it is instructive to reconsider it by working in the position representation and using the spherical coordinate system instead of the Cartesian one. The position representation of the Hamiltonian in this case becomes

$$
\begin{aligned}
\hat{H} &= -\frac{\hbar^2}{2m_e}\nabla^2 + \frac{1}{2}m_e\omega^2r^2 \\
&= -\frac{\hbar^2}{2m_er^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2m_er^2} + \frac{1}{2}m_e\omega^2r^2,
\end{aligned} \tag{7.69}
$$

where in the second line I used Eq. 5.62 representing the Laplacian operator in terms of the radial coordinate $r$ and operator $\hat{L}^2$. It is obvious that the Hamiltonian commutes with both $\hat{L}^2$ and $\hat{L}_z$ so that the eigenvectors of the Hamiltonian in the position representation can be written as

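$$
\psi_{n_r,l,m}(r, \theta, \varphi) = Y_l^m(\theta, \varphi)R_{n_r,l}(r). \tag{7.70}
$$

Substituting Eq. 7.70 into the time-independent Schrödinger equation

$$
\hat{H}\psi = E\psi
$$

with the Hamiltonian given by Eq. 7.69, you can derive for the radial function $R_{n_r,l}(r)$:

$$
-\frac{\hbar^2}{2m_er^2}\frac{d}{dr}\left(r^2\frac{dR_{n_r,l}}{dr}\right) + \frac{\hbar^2l(l + 1)}{2m_er^2}R_{n_r,l} + \frac{1}{2}m_e\omega^2r^2R_{n_r,l} = E_{l,n_r}R_{n_r,l}. \tag{7.71}
$$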

It is convenient to introduce an auxiliary function $u_{n_r,l}(r) = rR_{n_r,l}$, which, when inserted into the radial equation above, turns it into

$$
-\frac{\hbar^2}{2m_e}\frac{d^2u_{n_r,l}}{dr^2} + \frac{\hbar^2l(l + 1)}{2m_er^2}u_{n_r,l} + \frac{1}{2}m_e\omega^2r^2u_{n_r,l} = E_{l,n_r}u_{n_r,l}. \tag{7.72}
$$

Equation 7.72 looks exactly like a one-dimensional Schrödinger equation with effective potential:

$$
V_{eff} = \frac{\hbar^2l(l + 1)}{2m_er^2} + \frac{1}{2}m_e\omega^2r^2.
$$

The plot of this potential (Fig. 7.3) shows that it possesses a minimum

$$
V_{eff}^{min} = \hbar\omega\sqrt{l(l + 1)}
$$

at

$$
r_{min}^2 = \frac{\hbar}{m_e\omega}\sqrt{l(l + 1)}.
$$

Fig. 7.3 The schematic of the effective potential for the radial Schrödinger equation for isotropic 3-D harmonic oscillator in arbitrary units

(Of course, you do not need to plot this function to know that it has a minimum; just compute the derivative and find its zero.) For any given $l$, the allowed values of energy obeying the inequality $E > \hbar\omega\sqrt{l(l + 1)}$ correspond to classical bound motion; thus all energy levels in this effective potential are discrete (which, of course, is a surprise to nobody, but still is a nice fact to know). States with $l = 0$ are described by a Schrödinger equation that is an exact replica of the equation for the one-dimensional oscillator. You, however, should not rush to pull out of the drawer old dusty solutions of the one-dimensional problem (OK, not that old and dusty, but still). A huge difference with the purely one-dimensional case is the fact that the domain of the radial coordinate $r$ is $[0, \infty)$, unlike the domain of a coordinate in the one-dimensional problem, which is $(-\infty, \infty)$. Consequently, the wave function $u_{n_r,l}$ must obey a boundary condition at $r = 0$. Given that the actual radial function $R_{n_r,l} = u_{n_r,l}/r$ must remain finite at $r = 0$, it is clear that $u_{n_r,l}(0) = 0$. Now you can go ahead, brush the dust from the solutions of the one-dimensional harmonic oscillator problem, and see which fit this requirement. A bit of examination reveals that we have to throw out all even solutions with quantum numbers $0, 2, 4, \ldots$, which do not satisfy the boundary condition at the origin. At the same time, all odd solutions, characterized by quantum numbers $1, 3, 5, \ldots$, satisfy both the Schrödinger equation and the newly minted boundary condition at $r = 0$, so they (restricted to the positive values of the coordinate) do represent eigenvectors of the isotropic oscillator with zero angular momentum.
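The position and depth of the minimum of the effective potential can also be confirmed numerically. The following is a minimal sketch, assuming units in which $\hbar = m_e = \omega = 1$ (so that both $r_{min}^2$ and $V_{eff}^{min}$ should equal $\sqrt{l(l+1)}$); the helper names `v_eff` and `golden_min` and the search interval are choices made for this illustration only.

```python
import math


def v_eff(r, l):
    """Effective radial potential in units hbar = m_e = omega = 1 (an assumption)."""
    return l * (l + 1) / (2.0 * r**2) + 0.5 * r**2


def golden_min(f, a, b, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal function f on [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c, d = b - g * (b - a), a + g * (b - a)
    return 0.5 * (a + b)


if __name__ == "__main__":
    for l in (1, 2, 3):
        r_min = golden_min(lambda r: v_eff(r, l), 0.1, 10.0)
        # Columns: l, numerical r_min^2, analytic sqrt(l(l+1)), numerical V_min
        print(l, r_min**2, math.sqrt(l * (l + 1)), v_eff(r_min, l))
```

For $l = 0$ the centrifugal term is absent and the minimum sits at the origin, which is why the loop starts at $l = 1$.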

Solving this problem with $l > 0$ requires a bit more work. To make it somewhat easier to follow, I will begin by introducing a dimensionless radial coordinate $\varsigma = r/\lambda$, where $\lambda = \sqrt{\hbar/m_e\omega}$ is the same length scale that was used in the one-dimensional problem. The Schrödinger equation rewritten in this variable becomes

$$
-\frac{\hbar\omega}{2}\frac{d^2u_{n_r,l}}{d\varsigma^2} + \frac{\hbar\omega l(l + 1)}{2\varsigma^2}u_{n_r,l} + \frac{1}{2}\hbar\omega\varsigma^2u_{n_r,l} = E_{l,n_r}u_{n_r,l}
$$

$$
\frac{d^2u_{n_r,l}}{d\varsigma^2} - \frac{l(l + 1)}{\varsigma^2}u_{n_r,l} - \varsigma^2u_{n_r,l} + \epsilon_{l,n_r}u_{n_r,l} = 0 \tag{7.73}
$$

where I introduced the dimensionless energy

$$
\epsilon_{l,n_r} = 2E_{l,n_r}/\hbar\omega.
$$

The resulting differential equation obviously does not have simple solutions expressible in terms of elementary functions. One of the approaches to solving it is to present a solution in the form of a power series $\sum_j c_j\varsigma^j$ and search for unknown coefficients $c_j$. In principle, knowing these coefficients is equivalent to knowing the entire function. Before attempting this approach, however, it would be wise to try to extract whatever information about the solution this equation might contain. For instance, you can ask about the solution’s behavior at very small and very large values of $\varsigma$. When $\varsigma \ll 1$, the main term in Eq. 7.73 is the angular momentum contribution to the effective potential. Neglecting all other terms, you end up with an equation

$$
\frac{d^2u_{n_r,l}}{d\varsigma^2} - \frac{l(l + 1)}{\varsigma^2}u_{n_r,l} = 0, \tag{7.74}
$$

which has a simple power solution

$$
u_{n_r,l} = A\varsigma^{l+1}. \tag{7.75}
$$

You are welcome to plug it back into Eq. 7.74 and verify it by yourself. For those who think that I used magical divination to arrive at this result, I have disappointing news: Eq. 7.74 belongs to a well-known class of so-called homogeneous equations. This means that if I multiply $\varsigma$ by an arbitrary constant factor $\alpha$, the equation does not change (check it), with the consequence that if function $u(\varsigma)$ is a solution, so is function $u(\alpha\varsigma)$. Such equations are solved by power functions $u \propto \varsigma^{\varrho}$, where the power $\varrho$ is found by plugging this function into the equation.
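Plugging $u = \varsigma^{\varrho}$ into Eq. 7.74 gives the algebraic condition $\varrho(\varrho - 1) = l(l + 1)$, whose roots are $\varrho = l + 1$ and $\varrho = -l$; only the first one is compatible with the boundary condition $u(0) = 0$. The short sketch below checks both roots numerically. The function names `residual` and `allowed_powers` and the test point `s = 0.37` are arbitrary choices made for this illustration.

```python
def residual(rho, l, s):
    """Left-hand side of Eq. 7.74 for the trial function u = s**rho, evaluated at s."""
    second_derivative = rho * (rho - 1) * s ** (rho - 2)
    return second_derivative - l * (l + 1) / s**2 * s**rho


def allowed_powers(l):
    """Both roots of rho * (rho - 1) = l * (l + 1)."""
    return (l + 1, -l)


if __name__ == "__main__":
    s = 0.37  # arbitrary positive test point
    for l in range(4):
        for rho in allowed_powers(l):
            # The residual should vanish up to rounding for both roots
            print(l, rho, residual(rho, l, s))
```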

In the limit of large $\varsigma \gg 1$, the main contribution to Eq. 7.73 comes from the harmonic potential. We know from solving the one-dimensional problem that the respective wave functions contain an exponential term $\exp\left(-\tilde{x}^2/2\right)$ for the $x$ direction and similar terms for the two other coordinates. When multiplying all these wave functions together to obtain a three-dimensional wave function, these exponential terms turn into $\exp\left(-\varsigma^2/2\right)$; thus it is natural to expect that the radial function $u_{n_r,l}$ will contain such a factor as well. To verify this assumption, I am going to substitute $\exp\left(-\varsigma^2/2\right)$ into Eq. 7.73 and see if it will satisfy the equation, at least in the limit $\varsigma \to \infty$. Neglecting all terms small compared to $\varsigma^2$, I find

$$
\frac{d^2u}{d\varsigma^2} = -e^{-\varsigma^2/2} + \varsigma^2e^{-\varsigma^2/2} \approx \varsigma^2e^{-\varsigma^2/2}.
$$

Substituting this result into Eq. 7.73 and neglecting all terms except the harmonic potential, I find that this function is, indeed, an asymptotically accurate solution of this equation. I want you to really appreciate this result: in order to reproduce the exponential decay of the wave function (which, by the way, almost ensures its normalizability) using a power series, we would have to keep track of an infinite number of terms in it, which is quite difficult if not outright impossible. By pulling out this exponential term as well as the power law for small $\varsigma$, you might entertain some hope that the remaining dependence on $\varsigma$ is simple enough to be dug out.

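Thus, my next step is to present function $u_{n_r,l}(\varsigma)$ as

$$
u_{l,n_r}(\varsigma) = A\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)v_{l,n_r}(\varsigma) \tag{7.76}
$$

and derive a differential equation for the remaining function $v_{n_r,l}(\varsigma)$. To this end, I first compute (omitting the constant factor $A$)

$$
\begin{aligned}
\frac{d^2u_{n_r,l}}{d\varsigma^2} ={}& \frac{d}{d\varsigma}\left[(l + 1)\varsigma^{l}e^{-\varsigma^2/2}v_{n_r,l} - \varsigma^{l+2}e^{-\varsigma^2/2}v_{n_r,l} + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma}\right] \\
={}& l(l + 1)\varsigma^{l-1}e^{-\varsigma^2/2}v_{n_r,l} - (l + 1)\varsigma^{l+1}e^{-\varsigma^2/2}v_{n_r,l} + (l + 1)\varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma} \\
& - (l + 2)\varsigma^{l+1}e^{-\varsigma^2/2}v_{n_r,l} + \varsigma^{l+3}e^{-\varsigma^2/2}v_{n_r,l} - \varsigma^{l+2}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma} \\
& + (l + 1)\varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma} - \varsigma^{l+2}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma} + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{d^2v_{n_r,l}}{d\varsigma^2} \\
={}& e^{-\varsigma^2/2}\varsigma^{l-1}v_{n_r,l}\left[l(l + 1) - \varsigma^2(2l + 3) + \varsigma^4\right] \\
& + \varsigma^{l}e^{-\varsigma^2/2}\frac{dv_{n_r,l}}{d\varsigma}\left[2l + 2 - 2\varsigma^2\right] + \varsigma^{l+1}e^{-\varsigma^2/2}\frac{d^2v_{n_r,l}}{d\varsigma^2}.
\end{aligned}
$$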

Frankly speaking, I did not have to torture you with these tedious calculations: such computational platforms as Mathematica or Maple work with symbolic expressions and can perform this computation faster and more reliably (and, yes, I did check my result against Mathematica’s). Substituting this expression into Eq. 7.73, I get (and here you are on your own, or you can try computer algebra to reproduce this result)

$$
\varsigma\frac{d^2v_{n_r,l}}{d\varsigma^2} + 2\left(l + 1 - \varsigma^2\right)\frac{dv_{n_r,l}}{d\varsigma} + \varsigma v_{n_r,l}(\varsigma)\left(\epsilon_{l,n_r} - 2l - 3\right) = 0. \tag{7.77}
$$

Now I can start solving this equation by presenting the unknown function $v_{n_r,l}(\varsigma)$ as a power series and trying to find the corresponding coefficients:

$$
v_{n_r,l}(\varsigma) = \sum_{j=0}^{\infty}c_j\varsigma^{j}. \tag{7.78}
$$

The goal is to plug this expression into Eq. 7.77 and collect coefficients in front of equal powers of $\varsigma$. First, I blindly substitute the series into Eq. 7.77 and separate all sums with different powers of $\varsigma$:

$$
\sum_{j=0}^{\infty}c_jj(j - 1)\varsigma^{j-1} + 2(l + 1)\sum_{j=0}^{\infty}c_jj\varsigma^{j-1} - 2\sum_{j=0}^{\infty}c_jj\varsigma^{j+1} + \left(\epsilon_{l,n_r} - 2l - 3\right)\sum_{j=0}^{\infty}c_j\varsigma^{j+1} = 0.
$$

Combining the first two and last two sums, I get

$$
\sum_{j=0}^{\infty}j\left[j - 1 + 2l + 2\right]c_j\varsigma^{j-1} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.
$$

Next I notice that in the first sum, contributions from terms with $j = 0$ vanish, so that this sum starts with $j = 1$. I can reset the count of the summation index back to zero by introducing a new index $k = j - 1$, so that this sum becomes

$$
\sum_{k=0}^{\infty}c_{k+1}(k + 1)(k + 2l + 2)\varsigma^{k}.
$$

Renaming $k$ back to $j$ (this is a dummy index, so you can call it whatever you want, it does not care), we rewrite the previous equation as

$$
\sum_{j=0}^{\infty}(j + 1)\left[j + 2l + 2\right]c_{j+1}\varsigma^{j} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.
$$

The first sum in this expression begins with the $\varsigma^0$ term multiplied by coefficient $c_1$. The second sum, however, begins with the term linear in $\varsigma$ and does not contain $\varsigma^0$ at all. To satisfy the equation, coefficients in front of each power of $\varsigma$ must vanish independently of each other, so we have to set $c_1 = 0$. This makes the first sum start with $j = 1$ again. Utilizing the same trick as before, I am replacing $j$ with $j + 1$ while restarting the count from the new $j = 0$ again. The result is as follows:

$$
\sum_{j=0}^{\infty}(j + 2)\left[j + 2l + 3\right]c_{j+2}\varsigma^{j+1} + \sum_{j=0}^{\infty}\left[\epsilon_{l,n_r} - 2l - 3 - 2j\right]c_j\varsigma^{j+1} = 0.
$$

Now I can, finally, combine the two sums and equate the resulting coefficient in front of $\varsigma^{j+1}$ to zero:

$$
(j + 2)\left[j + 2l + 3\right]c_{j+2} = \left[2l + 3 + 2j - \epsilon_{l,n_r}\right]c_j
$$

or

$$
c_{j+2} = \frac{2l + 3 + 2j - \epsilon_{l,n_r}}{(j + 2)\left[j + 2l + 3\right]}c_j. \tag{7.79}
$$

This is a so-called recursion relation, which allows computing all expansion coefficients recursively starting with the first one. It is important to note that Eq. 7.79 connects only coefficients with indexes of the same parity: all coefficients with even indexes are expressed in terms of $c_0$, and all coefficients with odd indexes are expressed in terms of $c_1$. But, wait, haven't we determined a few lines back that $c_1 = 0$? Actually, we did determine that, and now, thanks to Eq. 7.79, I can establish that not only $c_1$ but all coefficients with odd indexes are zeroes. So, it looks like I achieved the announced goal of finding all coefficients in the power series expansion of $v_{n_r,l}$. Formally speaking, I did, indeed, but it is a bit too early to dance around the fire and celebrate. First, I still do not know what values of the dimensionless energy $\epsilon_{l,n_r}$ correspond to the respective eigenvectors, and, second, I have to verify that the found solution is, indeed, normalizable. The last issue is not trivial because we are dealing with an infinite series here, so there are always questions about its convergence and the behavior of the function it represents. As I shall demonstrate now, both these questions are connected and will be answered together.

Whether a function is normalizable or not is determined by its behavior for large values of its argument. I pulled out an exponentially decreasing factor from the solution hoping that it would be sufficient to guarantee normalization, but to be sure I need to consider the behavior of $v_{n_r,l}$ at $\varsigma \to \infty$. Any finite number of terms in the expansion 7.78 cannot overcome the exponentially decreasing factor $\exp\left(-\varsigma^2/2\right)$, so the anticipated danger can only come from the tail of the power series, i.e., from coefficients $c_j$ with $j \to \infty$. In this limit the recursion relation 7.79 can be simplified to

$$
c_{j+2} \approx \frac{2}{j}c_j, \tag{7.80}
$$

which, when applied repeatedly, yields

$$
c_{2j_0+2N} = \frac{2^{N}}{2j_0(2j_0 + 2)\cdots(2j_0 + 2N - 2)}c_{2j_0} = \frac{1}{j_0(j_0 + 1)\cdots(j_0 + N - 1)}c_{2j_0}.
$$

When writing this expression, I explicitly took into account that there are only even indexes, which can be presented as $2j_0 + 2k$, with the total number of recursive factors being $N$. Even though this expression is only valid for $j_0 \gg 1$, I can extend it to all values of $j_0$ because, as I pointed out earlier, any finite number of terms in the power series would not affect its asymptotic behavior. That means that the large-$\varsigma$ behavior of the series in question is the same as that of the series:

$$
\sum_{j=0}^{\infty}\frac{\varsigma^{2j}}{j!} = e^{\varsigma^2}.
$$
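The asymptotic form of the recursion relation is easy to probe numerically. In the sketch below (the function name `coefficient_ratio` and the sample values `l = 1`, `eps = 4.2` are assumptions made for illustration, with `eps` deliberately chosen so that the numerator of Eq. 7.79 never vanishes), the exact ratio $c_{j+2}/c_j$ is multiplied by $j/2$; according to Eq. 7.80 the product should approach one for large $j$.

```python
def coefficient_ratio(j, l, eps):
    """Exact ratio c_{j+2} / c_j from the recursion relation, Eq. 7.79."""
    return (2 * l + 3 + 2 * j - eps) / ((j + 2) * (j + 2 * l + 3))


if __name__ == "__main__":
    l, eps = 1, 4.2  # illustrative values; eps is not an eigenvalue for l = 1
    for j in (10, 100, 1000, 10000):
        # Second column tends to 1 as j grows, in line with Eq. 7.80
        print(j, coefficient_ratio(j, l, eps) * j / 2)
```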

Even after combining this result with the $\exp\left(-\varsigma^2/2\right)$ factor, which was pulled out earlier, I still end up with function $u_{n_r,l}(\varsigma)$ behaving as $\exp\left(\varsigma^2/2\right)$ at infinity. What a bummer! It is disappointing, but not really surprising: it is easy to check that $\exp\left(\varsigma^2/2\right)$ is the second possible asymptotic solution of Eq. 7.73, which I chose to discard because of its non-normalizable nature. Well, this is how it often is: you chase math out of the door, but it always comes back through the window to bite you. So, the question now is if there is anything I can do to save the normalizability of our solution. The light at the end of the tunnel will appear if you recall that any power series with a finite number of terms cannot overpower an exponentially decreasing function. Therefore, if I find a way to terminate the series at some finite number of terms, our conundrum will be resolved. To see how this is possible, let’s take another look at the recursion relation, Eq. 7.79. What if at some value of $j$, which I will call $2n_r$ to emphasize its evenness, the numerator of this relation turns to zero? If this were to happen, then coefficient $c_{2n_r+2}$ would vanish and vanquish all subsequent coefficients as well, so that instead of an infinite series, I will end up with a finite sum. This will surely guarantee the normalizability of the found solution. The condition for the numerator to vanish reads as

$$
2l + 3 + 4n_r - \epsilon_{l,n_r} = 0,
$$

which is immediately recognizable as an equation for the dimensionless energy $\epsilon_{l,n_r}$! While resolving the normalization problem, I automatically solved the eigenvalue problem as well. Using

$$
\epsilon_{l,n_r} = 3 + 2(l + 2n_r)
$$

as well as the relation between $\epsilon_{l,n_r}$ and the actual energy eigenvalues, I obtain

$$
E_{l,n_r} = \hbar\omega\left(\frac{3}{2} + l + 2n_r\right).
$$

Thus, for each $l$ and $n_r$, you have an energy value and a respective wave function

$$
u_{l,n_r}(\varsigma) = \varsigma^{l+1}\exp\left(-\varsigma^2/2\right)\sum_{j=0}^{2n_r}c_j\varsigma^{j} \tag{7.81}
$$

where coefficients $c_j$ are given by Eq. 7.79. To get a better feeling for this result, consider a few special examples.

1. $n_r = 0$. In this case the sum in Eq. 7.81 contains a single term $c_0$, so the non-normalized wave function becomes

$$
u_{l,0}(\varsigma) = c_0\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)
$$

with respective energy value $E_{l,0} = \hbar\omega\left(\frac{3}{2} + l\right)$.

2. $n_r = 1$. Using Eq. 7.79 with $\epsilon_{l,n_r} = 3 + 2(l + 2)$, I find for $c_2$ (substituting $j = 0$ into Eq. 7.79):

$$
c_2 = \frac{2l + 3 - (3 + 2l + 4)}{2\left[2l + 3\right]}c_0 = -\frac{2}{2l + 3}c_0
$$

so that

$$
u_{l,1}(\varsigma) = c_0\varsigma^{l+1}\exp\left(-\varsigma^2/2\right)\left(1 - \frac{2\varsigma^2}{2l + 3}\right).
$$
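The two examples above can be reproduced, and extended to larger $n_r$, by iterating Eq. 7.79 with exact rational arithmetic. This is a minimal sketch: the function name `series_coefficients` and the choice $c_0 = 1$ are assumptions made for illustration. It sets $\epsilon_{l,n_r} = 3 + 2(l + 2n_r)$ and takes two steps beyond $j = 2n_r$ to show that the series really terminates.

```python
from fractions import Fraction


def series_coefficients(l, n_r, c0=Fraction(1)):
    """Even-index coefficients c_0, c_2, ... generated by Eq. 7.79.

    The dimensionless energy is fixed to eps = 3 + 2*(l + 2*n_r).
    Two extra recursion steps are taken so that the vanishing of
    c_{2 n_r + 2} and c_{2 n_r + 4} is visible in the output.
    """
    eps = 3 + 2 * (l + 2 * n_r)
    coeffs = {0: Fraction(c0)}
    j = 0
    while j <= 2 * n_r + 2:
        factor = Fraction(2 * l + 3 + 2 * j - eps, (j + 2) * (j + 2 * l + 3))
        coeffs[j + 2] = factor * coeffs[j]
        j += 2
    return coeffs


if __name__ == "__main__":
    for l, n_r in [(0, 0), (2, 1), (1, 2)]:
        print(l, n_r, series_coefficients(l, n_r))
```

For `l = 2, n_r = 1` the output contains $c_2 = -2/7$, which is $-2/(2l + 3)$, followed by zeros.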

Following this pattern you can compute the wave functions belonging to any eigenvalue. For higher energy eigenvalues, it would take more time and effort, of course, but you can always give this task to a computer. Before finishing this section, I would like to note that the energy eigenvalues depend only on the sum $l + 2n_r$ rather than on each of these quantum numbers separately. It makes sense, therefore, to introduce a main quantum number $n = l + 2n_r$ and use it to characterize energy values:

$$
E_n = \hbar\omega\left(\frac{3}{2} + n\right). \tag{7.82}
$$

Then, the radial wave functions will be labeled by indexes $l$ and $n$ with the requirement $n - l = 2n_r \geq 0$, while the total wave function includes spherical harmonics and an additional index $m$. In actual physical variables, it becomes

$$
\psi_{n,l,m} = \frac{1}{\lambda}\left(\frac{r}{\lambda}\right)^{l}\exp\left(-\frac{r^2}{2\lambda^2}\right)\sum_{j=0}^{n-l}c_j\left(\frac{r}{\lambda}\right)^{j}Y_l^m(\theta, \varphi) \tag{7.83}
$$

where I reintroduced the radial function $R_{n_r,l} = u_{n_r,l}/r$. This function is not normalized until the value of coefficient $c_0$ in its radial part is defined, but I am not going to bother you with that. Instead, I will compute the degree of degeneracy of an energy eigenvalue characterized by main number $n$, which is much more fun. Taking into account that for each $l$ there are $2l + 1$ possible values of $m$, and that $l$ runs all the way down from $n$ in increments of 2 ($n - l$ must remain an even number), the total number of states with given $n$ is

$$
\sum_{l}(2l + 1) = (n + 1) + n(n + 1)/2 = (n + 1)(n + 2)/2,
$$

where the summation over $l$ is carried out with increments of 2. It is a nice feeling to realize that this expression for degeneracy agrees with the one obtained using Cartesian coordinates.
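The sum over $l$ in steps of 2 can be checked directly. The sketch below (the function name `spherical_degeneracy` is chosen for this illustration) adds up $2l + 1$ for $l = n, n - 2, \ldots$ and compares the total with $(n + 1)(n + 2)/2$.

```python
def spherical_degeneracy(n):
    """Sum of (2l + 1) over l = n, n - 2, ..., down to 0 (even n) or 1 (odd n)."""
    return sum(2 * l + 1 for l in range(n, -1, -2))


if __name__ == "__main__":
    for n in range(7):
        allowed_l = list(range(n, -1, -2))
        # Columns: n, allowed l values, degeneracy, closed-form value
        print(n, allowed_l, spherical_degeneracy(n), (n + 1) * (n + 2) // 2)
```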


The resulting expression for the wave function given by Eq. 7.83 is an alternative way to produce a position representation of the harmonic oscillator wave function and is quite remarkably different from the one obtained using Cartesian coordinates. One might wonder why it is at all possible to have such distinct ways to represent the same eigenvector. After all, isn’t a representation, once chosen, supposed to provide a unique way to describe a quantum state? As a matter of fact, this is so only if the corresponding eigenvalue is non-degenerate. In the degenerate case, one can form an infinite number of linear combinations of the eigenvectors belonging to the same eigenvalue, and any one of them is again an eigenvector with that eigenvalue. In the case of the isotropic harmonic oscillator, it means that the wave functions expressed in spherical coordinates can be presented as linear combinations of their Cartesian counterparts and vice versa.

7.3 Quantization of Electromagnetic Field and Harmonic Oscillators

7.3.1 Electromagnetic Field as a Harmonic Oscillator

Even though the idea of photons, the quanta of the electromagnetic field, was one of the first quantum ideas introduced into the consciousness of physicists by Einstein in 1905,1 the full quantum description of the electromagnetic field turned out to be a rather difficult problem. The first serious attempt at developing quantum electrodynamics was undertaken by Paul Dirac in his famous 1927 paper,2 which was just the beginning of a long and difficult path walked by too many brilliant physicists to be mentioned in this book. Here are just a few names of those who made critical theoretical contributions to this field: German-American Hans Bethe, Japanese Sin-Itiro Tomonaga, and Americans Julian Schwinger, Richard Feynman, and Freeman Dyson. Quantum electrodynamics is a difficult subject addressed in multiple specialized books and is beyond the scope of this text. Nevertheless, I would love to scratch the surface of this field a bit and demonstrate how ideas developed in the course of studying the harmonic oscillator emerge in new and unexpected places.

1 The irony is that an explanation of the photoelectric effect did not require the quantization of light, despite what you might have read or heard. All experimental data could have been explained by treating light classically while describing electrons in metals by the Schrödinger equation. Fortunately, Einstein did not have the Schrödinger equation in 1905 and couldn’t know that. Science does evolve in mysterious ways: Einstein’s erroneous idea about the photoelectric effect inspired de Broglie and Schrödinger and brought about the Schrödinger equation, which could have been used to disprove the idea. Compton’s effect, on the other hand, can indeed be considered as a proof of the reality of photons.

2 P.A.M. Dirac, The quantum theory of the emission and absorption of radiation. Proc. R. Soc. Lond. 114, 243 (1927).


To this end, I propose considering a toy model of the electromagnetic field, in which the field is described by single components of the electric and magnetic fields:

$$
E_x = aE_0(t)\sin kz \tag{7.84}
$$

$$
B_y = -\frac{1}{c}aB_0(t)\cos kz \tag{7.85}
$$

where I introduced a normalization coefficient $a$ to be defined later; the extra factor $1/c$, where $c$ is the speed of light in vacuum, in the formula for the magnetic field ensures that amplitudes $E_0$ and $B_0$ have the same dimension (you might remember from the introductory course on electromagnetism the relation $E = cB$ between electric and magnetic fields in a plane wave), and the negative sign is included for future convenience. The Maxwell equations for the electromagnetic field in this simplified case take the form

$$
\frac{\partial E_x}{\partial z} = -\frac{\partial B_y}{\partial t}, \qquad \frac{\partial B_y}{\partial z} = -\frac{1}{c^2}\frac{\partial E_x}{\partial t}.
$$

Plugging in the expressions for electric and magnetic fields given by Eqs. 7.84 and 7.85, you will find that the spatial dependence chosen for the fields in these equations is indeed consistent with the Maxwell equations, which will be reduced to the system of ordinary differential equations:

$$
\frac{dB_0}{dt} = \omega E_0(t) \tag{7.86}
$$

$$
\frac{dE_0}{dt} = -\omega B_0(t). \tag{7.87}
$$

Parameter $\omega$ appearing in these equations is defined as $\omega = ck$. It is easy to see that amplitudes of both electric and magnetic fields obey the same differential equation as a harmonic oscillator. For instance, differentiating the first of these equations with respect to time and using the second equation to replace the time derivative of the electric field, you will get

$$
\frac{d^2B_0}{dt^2} + \omega^2B_0 = 0.
$$

A similar equation can be derived for $E_0$. You can also notice that Eqs. 7.86 and 7.87 have some resemblance to the Hamiltonian equations of classical mechanics, and this may make you wonder if they can be derived from some kind of a Hamiltonian. If you are asking why on earth I would want to re-derive these equations from a Hamiltonian, you were not paying attention to the first 130 pages of the book. Hamiltonian formalism allows us to introduce canonical pairs of variables, which we can turn into operators obeying canonical commutation relations; thus a Hamiltonian formulation is the key to turning the classical theory of the electromagnetic field into the quantum one.
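The oscillator-like behavior of Eqs. 7.86 and 7.87 can be seen by integrating them numerically. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step; the integrator, the function names, and the initial condition $B_0 = 1$, $E_0 = 0$ are all choices made for this illustration and are not part of the text. With that initial condition the exact solution is $B_0 = \cos\omega t$, $E_0 = -\sin\omega t$, so the state should return to its starting point after one period $2\pi/\omega$ while $E_0^2 + B_0^2$ stays constant.

```python
import math


def rhs(state, omega):
    """Right-hand sides of Eqs. 7.86-7.87 for state = (B0, E0)."""
    b0, e0 = state
    return (omega * e0, -omega * b0)


def rk4_step(state, omega, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return (s[0] + h * k[0], s[1] + h * k[1])

    k1 = rhs(state, omega)
    k2 = rhs(shifted(state, k1, dt / 2), omega)
    k3 = rhs(shifted(state, k2, dt / 2), omega)
    k4 = rhs(shifted(state, k3, dt), omega)
    return (
        state[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        state[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def evolve(state, omega, t_total, steps):
    """Integrate from t = 0 to t = t_total in the given number of equal steps."""
    dt = t_total / steps
    for _ in range(steps):
        state = rk4_step(state, omega, dt)
    return state


if __name__ == "__main__":
    omega = 2.0
    period = 2 * math.pi / omega
    b0, e0 = evolve((1.0, 0.0), omega, period, 2000)
    # After one full period the amplitudes return to (1, 0)
    print(b0, e0, b0**2 + e0**2)
```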

How would one go about introducing a Hamiltonian for the electromagnetic fields? Naturally, one starts by remembering that Hamiltonian is the energy of the system and that the energy of the electromagnetic field is given by

$$
H = \int_V d^3r\left[\frac{1}{2}\varepsilon_0E^2 + \frac{1}{2\mu_0}B^2\right], \tag{7.88}
$$

where integration is carried over the entire space occupied by the field. However, if you attempt to directly compute this integral using Eqs. 7.84 and 7.85 for electric and magnetic fields, you will encounter a problem: the integral is infinite. This happens because the field occupies the entire infinite space and does not decrease with distance. To fix the problem, I introduce a large but finite region of volume V D LzSxy, where Lz is the linear dimension of this region in z direction and Sxy is the area of the limiting plane perpendicular to it, and assume that the field vanishes outside of this region. This trick is very popular in physics, and you will encounter it in different circumstances later in the book. It can be justified by noting that the notion of a field occupying the entire space is by itself quite artificial with no relation to reality. It is also natural to assume that the properties of the field far away from the region of actual interest should not affect any observable phenomena, so that we can choose them to be as convenient for us as possible.

With this in mind, I can write the integral in Eq. 7.88 as

$$
H = a^2S_{xy}\left[\frac{1}{2}\varepsilon_0E_0^2\int_0^L dz\,\sin^2kz + \frac{1}{2\mu_0c^2}B_0^2\int_0^L dz\,\cos^2kz\right] = \frac{1}{4}a^2\varepsilon_0S_{xy}L\left(E_0^2 + B_0^2\right),
$$

where $L \equiv L_z$ and I assumed that $k$ satisfies the condition $kL = \pi n$, $n = 1, 2, \ldots$, making $\sin 2kz = 0$ at both the lower and upper integration limits so that the oscillating contributions to both integrals vanish and each integral equals $L/2$. Also, at the last step, I made the substitution $(\mu_0\varepsilon_0)^{-1} = c^2$. You might, of course, object to the artificial discretization of the wave number and imposition of the arbitrary conditions on the values of the electric and magnetic fields at $z = L$. So, what can I say in my defense? First, in the limit $L \to \infty$, which I can take after everything is said and done, the discretization will disappear, and as you will see in a few short minutes, I will make the dependence on the volume, which popped up in the last expression for the Hamiltonian, disappear as well. Second, I can invoke the same argument I just made when limiting the field to the finite volume: the behavior of the field in any finite region of space shall not be affected by its values at an infinitely remote plane. If you are still not convinced, I have my last line of defense: it works!
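The value $L/2$ of both $z$-integrals under the condition $kL = \pi n$ can be verified with a simple quadrature. The sketch below uses a composite midpoint rule; the function names `integrate` and `mode_integrals`, the number of steps, and the sample length `3.0` are arbitrary choices made for this illustration.

```python
import math


def integrate(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def mode_integrals(n, length):
    """Integrals of sin^2(kz) and cos^2(kz) over [0, length] with k * length = pi * n."""
    k = math.pi * n / length
    sin_part = integrate(lambda z: math.sin(k * z) ** 2, 0.0, length)
    cos_part = integrate(lambda z: math.cos(k * z) ** 2, 0.0, length)
    return sin_part, cos_part


if __name__ == "__main__":
    length = 3.0
    for n in (1, 2, 5):
        # Both integrals should equal length / 2 = 1.5
        print(n, mode_integrals(n, length))
```

If the condition $kL = \pi n$ is dropped, the two integrals differ from $L/2$ by terms proportional to $\sin 2kL$, which is the boundary contribution discussed in the text.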


Now, I am ready to fix the normalization parameter a introduced in Eqs. 7.84 and 7.85. For the reasons which will become clear later, I will choose it to be

$$
a = \sqrt{2\omega/(\varepsilon_0V)}, \tag{7.89}
$$

so that the final expression for the energy of the field becomes

$$
H = \frac{\omega}{2}\left(E_0^2 + B_0^2\right). \tag{7.90}
$$

Did you notice that the dependence on the volume in the Hamiltonian is gone? This is fiction of course, because I have simply hidden it inside formulas for the fields, but in all expressions concerned with actual physical observables, it will vanish in all honesty.

Equation 7.90 looks very much like the Hamiltonian of a harmonic oscillator. The first term can be interpreted as kinetic energy with $E_0$ playing the role of the canonical momentum and $1/\omega$ replacing the mass, and the second term is an analog of the potential energy with $B_0$ as the conjugate coordinate (note that the coefficient $m_e\omega^2/2$ in the harmonic oscillator potential energy is reduced to the $\omega/2$ factor in Eq. 7.90 if you replace $m_e$ with $1/\omega$). If you wonder why I chose the electric field to represent the momentum and the magnetic field to be the coordinate, and not vice versa, just compare Eqs. 7.86 and 7.87 with Hamiltonian equations 7.8 and 7.7, paying attention to the placement of the negative sign in these equations. You can easily see that the Hamiltonian equations reproduce Eqs. 7.86 and 7.87, justifying this identification. But do not be fooled. Identifying the magnetic field with the coordinate and the electric field with the momentum is, of course, a matter of convention resulting from the choice to place the negative sign in Eq. 7.85.

The Hamiltonian formulation of the classical Maxwell equations allows me now to introduce the quantum description of the fields. This is done by promoting E0 and B0 to operators with the standard canonical commutation relation:

$$
\left[\hat{B}_0, \hat{E}_0\right] = i\hbar. \tag{7.91}
$$

As a result, the classical Hamiltonian, Eq. 7.90, becomes a Hamiltonian operator:

$$
\hat{H} = \frac{\omega}{2}\left(\hat{E}_0^2 + \hat{B}_0^2\right). \tag{7.92}
$$

It is easy to see from Eq. 7.90 that both $E_0$ and $B_0$ have the dimension of $\sqrt{\text{energy} \times \text{time}}$, so that the dimension of the commutator on the left-hand side of Eq. 7.91 is energy $\times$ time, which coincides with the dimension of Planck’s constant, as it should. This result is not particularly surprising, of course, but it is always useful to check your dimensions once in a while just to make sure that your theory does not have any of the most basic problems. Using Eq. 7.91 together with Eqs. 7.84 and 7.85, I can compute the commutator of the non-zero components of the electric and magnetic fields, which, of course, are now also operators:

$$
\left[\hat{B}_y, \hat{E}_x\right] = -\frac{i\hbar\omega}{\varepsilon_0cV}\sin 2kz. \tag{7.93}
$$

One immediate consequence of this result is the uncertainty relation for these components:

$$
\Delta B_y\,\Delta E_x \geq \frac{\hbar\omega}{2\varepsilon_0cV}\left|\sin 2kz\right|, \tag{7.94}
$$

which shows that just like the coordinate and momentum, electric and magnetic fields cannot both be known with certainty in the same quantum state.

The canonical commutator, Eq. 7.91, also indicates that in the representation using eigenvectors of $\hat{B}_0$ as a basis, in which states are represented by wave functions dependent on the magnetic field amplitude $B_0$, the representation of the electric field amplitude operator $\hat{E}_0$ is

$$
\hat{E}_0 = -i\hbar\frac{\partial}{\partial B_0},
$$

while the Hamiltonian takes the form

$$
\hat{H} = \frac{\omega}{2}\left(-\hbar^2\frac{\partial^2}{\partial B_0^2} + B_0^2\right).
$$

Comparing this expression with the quantum Hamiltonian of the harmonic oscillator in the coordinate representation, you can see that they are mathematically identical if again you replace $m_e$ with $1/\omega$. The wave functions representing eigenvectors of this Hamiltonian can be written down in this representation as

$$
\varphi_n(B_0) = \frac{1}{\sqrt{2^nn!\lambda_{em}\sqrt{\pi}}}\exp\left(-\frac{B_0^2}{2\lambda_{em}^2}\right)H_n\left(\frac{B_0}{\lambda_{em}}\right) \tag{7.95}
$$

where the characteristic scale of the quantum fluctuations of the magnetic field, $\lambda_{em}$, is determined solely by Planck’s constant, $\lambda_{em} = \sqrt{\hbar}$ (this result follows from Eq. 7.42 after the substitution $m_e = 1/\omega$). As with any wave function, $|\varphi_n(B_0)|^2$ determines the probability density function for the magnetic field amplitude.

While it is interesting to see how one can turn the coordinate representation of the harmonic oscillator into the magnetic field representation of the quantum electromagnetic theory, the practical value of this representation is quite limited. Much more important, from both theoretical and practical points of view, is the opportunity to introduce electromagnetic analogs of lowering and raising operators. In order to distinguish these operators from those used in the harmonic oscillator problem, I will use notation $\hat{b}$ and $\hat{b}^\dagger$ (do not confuse these operators with variables $b$ used in the description of the classical oscillator), where

$$
\hat{b} = \sqrt{\frac{1}{2\hbar}}\hat{B}_0 + i\frac{\hat{E}_0}{\sqrt{2\hbar}} \tag{7.96}
$$

$$
\hat{b}^\dagger = \sqrt{\frac{1}{2\hbar}}\hat{B}_0 - i\frac{\hat{E}_0}{\sqrt{2\hbar}}. \tag{7.97}
$$

Equations 7.96 and 7.97 are obtained from Eqs. 7.22 and 7.23 by setting $m_e\omega = 1$ and replacing $\hat{x}$ and $\hat{p}$ by $\hat{B}_0$ and $\hat{E}_0$, respectively. Hamiltonian 7.92 expressed in terms of these operators acquires a familiar form:

$$
\hat{H} = \hbar\omega\left(\hat{b}^\dagger\hat{b} + 1/2\right).
$$

All commutators, which were computed in Sect. 7.1.1, remain exactly the same, so I can simply reproduce the results from that section: the energy eigenvalues of the electromagnetic field are given again by

$$
E_n = \hbar\omega\left(n + \frac{1}{2}\right), \tag{7.98}
$$

while eigenvectors can be constructed from the ground state $|0\rangle$ as

$$
|n\rangle = \frac{1}{\sqrt{n!}}\left(\hat{b}^\dagger\right)^n|0\rangle. \tag{7.99}
$$

Formally, both these results are exactly the same as in the case of the harmonic oscillator. However, the physical interpretation of the integer n in these expressions and, therefore, of both energy values and eigenvectors is completely different.

Indeed, in the case of a harmonic oscillator, we have a material particle, which can be placed in states with different energies, counted by the integer $n$. The electromagnetic field, on the other hand, once created, carries a certain amount of energy, and the same field cannot be made to have “more” energy. To produce a field with higher energy, you need to increase its amplitude, i.e., add “more” field. The discrete nature of allowed energy levels tells us that the energy of the field can only be increased in finite increments: to go from a state of the electromagnetic field with energy $E_n$ to the state with energy $E_{n+1}$, you have to add a discrete “quantum” of field with energy $\hbar\omega$. This discrete energy quantum is what was introduced by Einstein in 1905 as “das Lichtquant.” Replacing the term “quantum of light” with the term “photon,”3 you can say that number $n$ is the number of photons in a given state and that going from state $|n\rangle$ to state $|n + 1\rangle$ amounts to generating or creating an extra photon, while transitioning to state $|n - 1\rangle$ means removing or annihilating a photon. To emphasize this point, operators $\hat{b}^\dagger$ and $\hat{b}$ are called in the context of quantum electromagnetic field theory “creation” and “annihilation” operators, respectively, rather than raising and lowering operators. The ground state $|0\rangle$ in this interpretation is the state with zero photons and is called, therefore, the vacuum state. A counterintuitive aspect of the vacuum state is that even though it is devoid of photons, it still has non-zero energy, which in our oversimplified model is just $\hbar\omega/2$. At first glance this might appear to be a nonsensical result: how can zero photons have non-zero energy? I hope it will not blow your mind away if I say that in a more complete theory, which takes into account multiple modes (waves with different wave vectors $k$) of the electromagnetic field, the “vacuum” energy might become formally infinite. In order to wrap your mind around this weird result, consider the following.

3 It is interesting that the term “photon” was used for the first time in an obscure paper by an American chemist Gilbert Lewis in 1926. His paper is forgotten, but the term he coined lives on.

The photon is not just “a quantum of electromagnetic field” as you might have read in popular books and introductory physics texts. The concept of a “photon” has a quite specific mathematically rigorous meaning: a single photon is an eigenvector of the electromagnetic Hamiltonian characterized by $n = 1$. Eigenvectors characterized by higher values of $n$ describe $n$-photon states. The states described by eigenvectors of the Hamiltonian are not the states in which the electric or magnetic field has any definite value. Moreover, the commutation relation, Eq. 7.93, and the uncertainty relation 7.94 that follows from it indicate that there are no states in which electric and magnetic fields both have definite values. Furthermore, in the states with fixed photon numbers, the expectation values of electric and magnetic fields are zero, just like the expectation values of coordinate and momentum operators of the mechanical harmonic oscillator. At the same time, the expectation values of the squares of the fields are not zero, and these are the quantities which determine the energy of the fields. These are what we call vacuum fluctuations of electromagnetic field, where vacuum has, again, a very specific meaning: it is not just emptiness or a void; it is a state with zero photons, which is not the same as a state with zero field.

The second issue which needs to be discussed in connection with vacuum energy is, again, the fact that a zero level of energy is always established arbitrarily. The vacuum energy, which we found, is counted from the (non-existent in quantum theory) state, in which both electric and magnetic fields are presumed to be zero. As long as the energy of the vacuum state does not change while the phenomena we are interested in play out, we can set the vacuum energy to zero with no consequences for any physically significant results. To provide a counterexample to this statement, let me briefly describe a situation in which this assumption might not be true. If you consider the electromagnetic field between two conducting plates, the modes of the field and, therefore, its vacuum energy depend on the distance between the plates. This distance can be changed, in which case the vacuum energy also changes. Because it can change, the vacuum energy becomes physically relevant and results in a tiny but observable attractive force acting between the plates, known as the Casimir force. In most other situations, however, the vacuum energy is just a constant, whose value (finite or infinite) has no physical significance.


Thus, the eigenvectors of the electromagnetic Hamiltonian representing states with a definite number of photons, $n$, bear little resemblance to classical electromagnetic waves, just like stationary states of the harmonic oscillator have no relation to the motion of the classical pendulum. At the same time, in Sect. 7.1.2, I demonstrated that a generic nonstationary state reproduces oscillations of the expectation values of coordinate and momentum resembling those of their classical counterparts. While this result is true for a generic initial state, and the behavior of the expectation values to a large extent does not depend on the details of that state, not all initial states are created equal. However, to notice the difference between them, we have to go beyond the expectation values and consider the uncertainties of both coordinate and momentum or, in the electromagnetic context, of electric and magnetic fields. The fact that different initial states result in different behavior of uncertainties has already been demonstrated in the examples presented in Sect. 7.1.2. However, out of all the multitude of various initial states, there exists one for which these uncertainties are minimized in the sense that their product has the smallest value allowed by the uncertainty principle. In the electromagnetic case it means that the sign $\geq$ in Eq. 7.94 is replaced with $=$. These states are called “coherent” states, and they are much more important in the electrodynamic than in the mechanical context, so this is where I shall deal with them.

7.3.2 Coherent States of the Electromagnetic Field

The coherent states are defined as eigenvectors of the annihilation operator:

$$\hat b\,|\alpha\rangle = \alpha\,|\alpha\rangle. \tag{7.100}$$

Since the annihilation operator is not Hermitian, you should not expect the eigenvalues to be real, and we do not know yet if they are continuous or discrete. I can, however, try to find the representation of vectors $|\alpha\rangle$ in the basis of the eigenvectors $|n\rangle$ of the electromagnetic Hamiltonian:

$$|\alpha\rangle = \sum_{n=0}^{\infty} c_n\,|n\rangle, \tag{7.101}$$

where $c_n = \langle n|\alpha\rangle$. The Hermitian conjugation of Eq. 7.99 yields

$$\langle n| = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat b^{\,n} \tag{7.102}$$

so that I can find for the expansion coefficients

$$c_n = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat b^{\,n}\,|\alpha\rangle = \frac{\alpha^n}{\sqrt{n!}}\,\langle 0|\alpha\rangle.$$

The only unknown quantity here is $c_0 = \langle 0|\alpha\rangle$, which I find by requiring that $|\alpha\rangle$ is normalized, which means that $\sum_n |c_n|^2 = 1$. Applying this last condition, I have

$$|c_0|^2 \sum_{n=0}^{\infty} \frac{|\alpha|^{2n}}{n!} = |c_0|^2 \exp\left(|\alpha|^2\right) = 1,$$

where I recalled that $\sum_n x^n/n!$ is the power series expansion of the exponential function of $x$. Thus, choosing $c_0$ to be real-valued, I have the following final expression for the expansion coefficients:

$$c_n = e^{-|\alpha|^2/2}\,\frac{\alpha^n}{\sqrt{n!}}. \tag{7.103}$$

Equation 7.103 together with Eq. 7.101 completely defines a coherent state with eigenvalue $\alpha$. Since the derivation of the eigenvector did not produce any restrictions on $\alpha$, it must be presumed to be a continuous complex-valued variable. The vector that I found describes a state which is a superposition of states with different numbers of photons and, correspondingly, with different energies. The number of photons in this state is therefore a random quantity with a probability distribution given by

$$p_n = |c_n|^2 = e^{-|\alpha|^2}\,\frac{|\alpha|^{2n}}{n!}. \tag{7.104}$$
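The expansion coefficients of Eq. 7.103 are easy to tabulate numerically, which allows a quick sanity check of both the normalization and the defining relation $\hat b|\alpha\rangle = \alpha|\alpha\rangle$. The sketch below is illustrative only: the value of `alpha`, the cutoff `n_max`, and the function name are assumptions made here, and the infinite sum is truncated.

```python
import cmath
import math


def coherent_coefficients(alpha, n_max):
    """Return c_0 .. c_{n_max} of Eq. 7.103 via c_{n+1} = c_n * alpha / sqrt(n + 1)."""
    coeffs = [cmath.exp(-abs(alpha) ** 2 / 2)]  # c_0 = exp(-|alpha|^2 / 2)
    for n in range(n_max):
        coeffs.append(coeffs[-1] * alpha / math.sqrt(n + 1))
    return coeffs


alpha = 1.5 + 0.5j  # hypothetical complex eigenvalue
c = coherent_coefficients(alpha, 60)

# Photon-number probabilities p_n = |c_n|^2 (Eq. 7.104) and their sum.
probabilities = [abs(x) ** 2 for x in c]
total = sum(probabilities)

# In the number basis (b|alpha>)_n = sqrt(n + 1) c_{n+1}; it should equal alpha * c_n.
residual = max(abs(math.sqrt(n + 1) * c[n + 1] - alpha * c[n])
               for n in range(len(c) - 1))

print(total, residual)
```

The recursion avoids computing large factorials directly, and for moderate $|\alpha|$ a cutoff of a few tens of terms already captures essentially all of the probability.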

Equation 7.104 describes a well-known probability distribution, called the Poisson distribution, which appears in a large number of physical and mathematical problems. This distribution describes the probability that $n$ events will happen within some fixed interval (of time or of distance) provided that the probability of each event is independent of the occurrence of the others and all events are happening at a constant rate (probability per unit time or unit length or unit volume does not depend upon time or position). This distribution describes, for instance, the probability that $n$ atoms will undergo radioactive decay within some time interval or the number of uniformly distributed non-interacting gas molecules that will be found occupying some volume in space. For more examples of the Poisson distribution, just google it. The entire Poisson distribution depends on a single parameter $|\alpha|^2$, whose physical meaning can be elucidated by computing the mean (or expectation value) of the number of photons $\bar n_\alpha$ in the state $|\alpha\rangle$:

$$\begin{aligned}
\bar n_\alpha &= \sum_{n=0}^{\infty} n\,p_n = e^{-|\alpha|^2}\sum_{n=0}^{\infty} n\,\frac{|\alpha|^{2n}}{n!} \\
&= e^{-|\alpha|^2}\sum_{n=1}^{\infty} \frac{|\alpha|^{2n}}{(n-1)!} = e^{-|\alpha|^2}\sum_{k=0}^{\infty} \frac{|\alpha|^{2(k+1)}}{k!} \\
&= e^{-|\alpha|^2}\,|\alpha|^2\sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2\,e^{|\alpha|^2} = |\alpha|^2,
\end{aligned}$$

where in the second line, I first took into account that the $n = 0$ term in the sum is multiplied by $n = 0$ and, therefore, does not contribute. Accordingly I started the sum with $n = 1$, after which I introduced a new index $k = n - 1$, which reset the counter back to zero. As a result, I gained an extra factor $|\alpha|^2$, while the remaining sum became just an exponential function canceling out the normalization term $e^{-|\alpha|^2}$. This calculation shows that $|\alpha|^2$ has the meaning of the average number of photons in the state with eigenvalue $\alpha$. It is also interesting to compute the uncertainty of the number of photons in this state, $\Delta n = \sqrt{\left\langle (n - \bar n_\alpha)^2 \right\rangle} = \sqrt{\langle n^2\rangle_\alpha - \bar n_\alpha^2}$. First, I compute $\langle n^2\rangle_\alpha$:

$$\begin{aligned}
\langle n^2\rangle_\alpha &= e^{-|\alpha|^2}\sum_{n=0}^{\infty} n^2\,\frac{|\alpha|^{2n}}{n!} = e^{-|\alpha|^2}\sum_{n=1}^{\infty} n\,\frac{|\alpha|^{2n}}{(n-1)!} \\
&= e^{-|\alpha|^2}\sum_{k=0}^{\infty} (k+1)\,\frac{|\alpha|^{2(k+1)}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2\sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} \\
&\quad + e^{-|\alpha|^2}\,|\alpha|^2\sum_{k=0}^{\infty} k\,\frac{|\alpha|^{2k}}{k!} = |\alpha|^2 + |\alpha|^4,
\end{aligned}$$

where I used the same trick with the sum as above, twice. Now I can find that $\Delta n = \sqrt{\bar n_\alpha}$. The relative uncertainty of the photon number is $\Delta n/\bar n_\alpha = 1/\sqrt{\bar n_\alpha}$, and it becomes progressively smaller as the average number of photons increases. The decrease of the quantum fluctuations signifies transition to classical behavior, and one can suppose, therefore, that in the limit $\bar n_\alpha \gg 1$, the electric and magnetic fields in this state will reproduce behavior typical for a classical electromagnetic wave. To verify this assumption, I will compute the expectation values and uncertainties of the electric and magnetic fields for this state and will also consider their time dependence.
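The two sums above can also be checked by brute force. The following sketch sums the Poisson distribution of Eq. 7.104 numerically and compares the result with $\bar n_\alpha = |\alpha|^2$ and $\Delta n = \sqrt{\bar n_\alpha}$; the cutoff `n_max` and the sample mean values are arbitrary choices made for illustration.

```python
import math


def poisson_photon_statistics(mean_n, n_max=400):
    """Return (mean, standard deviation) of p_n = exp(-mean_n) * mean_n**n / n!.

    mean_n plays the role of |alpha|^2; the infinite sums are cut off at n_max.
    """
    p = math.exp(-mean_n)  # p_0
    mean = 0.0
    mean_sq = 0.0
    for n in range(n_max + 1):
        mean += n * p
        mean_sq += n * n * p
        p *= mean_n / (n + 1)  # p_{n+1} = p_n * mean_n / (n + 1)
    return mean, math.sqrt(mean_sq - mean ** 2)


for mean_n in (1.0, 25.0, 100.0):
    mean, delta = poisson_photon_statistics(mean_n)
    # The last column is the relative uncertainty, expected to fall as 1/sqrt(mean_n).
    print(mean_n, mean, delta, delta / mean)
```

The relative uncertainty in the last column shrinks as the mean photon number grows, which is the numerical counterpart of the approach to the classical limit described in the text.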

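Reversing Eqs. 7.96 and 7.97, I find for the fields

$$\hat B_0 = \sqrt{\frac{\hbar}{2}}\left(\hat b + \hat b^\dagger\right) \tag{7.105}$$

$$\hat E_0 = i\sqrt{\frac{\hbar}{2}}\left(\hat b^\dagger - \hat b\right). \tag{7.106}$$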


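Taking squares of these expressions yields

$$\hat B_0^2 = \frac{\hbar}{2}\left(\hat b^2 + \hat b^{\dagger 2} + \hat b\hat b^\dagger + \hat b^\dagger\hat b\right) = \frac{\hbar}{2}\left(\hat b^2 + \hat b^{\dagger 2} + 2\hat b^\dagger\hat b + 1\right) \tag{7.107}$$

$$\hat E_0^2 = -\frac{\hbar}{2}\left(\hat b^2 + \hat b^{\dagger 2} - \hat b\hat b^\dagger - \hat b^\dagger\hat b\right) = -\frac{\hbar}{2}\left(\hat b^2 + \hat b^{\dagger 2} - 2\hat b^\dagger\hat b - 1\right) \tag{7.108}$$

where I changed the order of operators in $\hat b\hat b^\dagger$ using the commutation relation $\left[\hat b, \hat b^\dagger\right] = 1$. Now I am ready to tackle both the expectation values and the uncertainties. The computation of the expectation values $\langle\hat B_0\rangle$ and $\langle\hat E_0\rangle$ is almost trivial: taking into account that $\langle\alpha|\hat b|\alpha\rangle = \alpha$ and $\langle\alpha|\hat b^\dagger|\alpha\rangle = \langle\alpha|\hat b|\alpha\rangle^* = \alpha^*$, I have

$$\langle\hat B_0\rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha + \alpha^*\right) \tag{7.109}$$

$$\langle\hat E_0\rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* - \alpha\right). \tag{7.110}$$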

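The expectation values of the squares of the fields take just a bit more work: before computing $\langle\alpha|\hat b^\dagger\hat b|\alpha\rangle$, I first need to realize that the Hermitian conjugate of expression $\hat b|\alpha\rangle = \alpha|\alpha\rangle$ is $\langle\alpha|\hat b^\dagger = \alpha^*\langle\alpha|$, so that $\langle\alpha|\hat b^\dagger\hat b|\alpha\rangle = |\alpha|^2$. With this little insight, the rest of the computation is as trivial as that for the expectation values. The result is

$$\langle\hat B_0^2\rangle = \frac{\hbar}{2}\left(\alpha + \alpha^*\right)^2 + \frac{\hbar}{2} \tag{7.111}$$

$$\langle\hat E_0^2\rangle = -\frac{\hbar}{2}\left(\alpha^* - \alpha\right)^2 + \frac{\hbar}{2}. \tag{7.112}$$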

Finally, the uncertainties of both fields are found to be independent of $\alpha$ and equal to

$$\Delta\hat B_0 = \Delta\hat E_0 = \sqrt{\frac{\hbar}{2}},$$

so that their product indeed is the smallest allowed by the uncertainty principle, $\Delta\hat B_0\,\Delta\hat E_0 = \hbar/2$. Relative uncertainties $\Delta\hat B_0/\langle\hat B_0\rangle$ diminish with the increase in $|\alpha| = \sqrt{\bar n_\alpha}$ and vanish in the limit $\bar n_\alpha \to \infty$, which obviously corresponds to the classical (no quantum fluctuations) limit. This result provides an additional reinforcement to the idea that the electromagnetic field in the coherent states is as close to a classical wave as possible.
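The statement that the field uncertainties do not depend on $\alpha$ can be probed numerically by evaluating the expectation values directly from the number-basis coefficients of Eq. 7.103. The sketch below assumes units with $\hbar = 1$, truncates the basis at `n_max`, and uses illustrative function names and sample values of `alpha`; none of these choices come from the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def coherent_coefficients(alpha, n_max):
    """Return c_0 .. c_{n_max} of Eq. 7.103 via c_{n+1} = c_n * alpha / sqrt(n + 1)."""
    coeffs = [cmath.exp(-abs(alpha) ** 2 / 2)]
    for n in range(n_max):
        coeffs.append(coeffs[-1] * alpha / math.sqrt(n + 1))
    return coeffs


def field_uncertainties(alpha, n_max=80):
    """Return (Delta B0, Delta E0) in the coherent state |alpha>."""
    c = coherent_coefficients(alpha, n_max)
    # <b>, <b^2>, and <b_dagger b> computed from the number-basis expansion.
    mean_b = sum(c[n].conjugate() * math.sqrt(n + 1) * c[n + 1]
                 for n in range(n_max))
    mean_b2 = sum(c[n].conjugate() * math.sqrt((n + 1) * (n + 2)) * c[n + 2]
                  for n in range(n_max - 1))
    mean_n = sum(n * abs(c[n]) ** 2 for n in range(n_max + 1))

    # Eqs. 7.105-7.108 expressed through the three moments above.
    mean_B = math.sqrt(HBAR / 2) * 2 * mean_b.real
    mean_E = math.sqrt(HBAR / 2) * 2 * mean_b.imag
    mean_B2 = HBAR / 2 * (2 * mean_b2.real + 2 * mean_n + 1)
    mean_E2 = -HBAR / 2 * (2 * mean_b2.real - 2 * mean_n - 1)

    return (math.sqrt(mean_B2 - mean_B ** 2), math.sqrt(mean_E2 - mean_E ** 2))


for alpha in (0.5 + 0.0j, 3.0 + 2.0j):
    delta_B, delta_E = field_uncertainties(alpha)
    print(alpha, delta_B, delta_E, delta_B * delta_E)
```

For both sample values the two uncertainties come out close to $\sqrt{\hbar/2}$ and their product close to $\hbar/2$, in line with the minimum-uncertainty property discussed above.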

Finally, I will consider how these quantities (the expectation values and uncertainties) change with time. The easiest way to do this is to use the Heisenberg picture, in which all dynamics is given by the time dependence of the annihilation operator, which as we know from the consideration of the harmonic oscillator is very simple, $\hat b_H(t) = \hat b\exp(-i\omega t)$, so that $\langle\alpha|\hat b_H|\alpha\rangle = \alpha\exp(-i\omega t)$. With this I immediately find for the field expectation values

$$\langle\hat B_0(t)\rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right) \tag{7.113}$$

$$\langle\hat E_0(t)\rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* e^{i\omega t} - \alpha e^{-i\omega t}\right) \tag{7.114}$$

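and for their squares

$$\langle\hat B_0^2(t)\rangle = \frac{\hbar}{2}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right)^2 + \frac{\hbar}{2} \tag{7.115}$$

$$\langle\hat E_0^2(t)\rangle = -\frac{\hbar}{2}\left(\alpha^* e^{i\omega t} - \alpha e^{-i\omega t}\right)^2 + \frac{\hbar}{2}. \tag{7.116}$$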

It is remarkable that the uncertainties of the fields, determined by $\langle\hat B_0^2(t)\rangle - \langle\hat B_0(t)\rangle^2$ and $\langle\hat E_0^2(t)\rangle - \langle\hat E_0(t)\rangle^2$, remain time independent and satisfy the minimal form of the uncertainty principle at all times. While the harmonic time dependence of the expectation value is typical for almost any initial state, the uncovered behavior of the uncertainties is a special property of the coherent states and is what makes them so special. This also guarantees that the shape of the coherent superposition of the stationary states does not get distorted with time, similar to what one would expect from a classical electromagnetic wave.
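The classical character of this time dependence becomes explicit if the eigenvalue is written in polar form. The following is a short worked form under one assumption introduced here and not used in the text: $\alpha = |\alpha|e^{i\varphi}$, with $\varphi$ the phase of $\alpha$. Substituting it into the expectation values of the fields in the Heisenberg picture gives

$$\langle\hat B_0(t)\rangle = \sqrt{\frac{\hbar}{2}}\,|\alpha|\left(e^{-i(\omega t - \varphi)} + e^{i(\omega t - \varphi)}\right) = \sqrt{2\hbar}\,|\alpha|\cos(\omega t - \varphi),$$

$$\langle\hat E_0(t)\rangle = i\sqrt{\frac{\hbar}{2}}\,|\alpha|\left(e^{i(\omega t - \varphi)} - e^{-i(\omega t - \varphi)}\right) = -\sqrt{2\hbar}\,|\alpha|\sin(\omega t - \varphi).$$

The two expectation values oscillate a quarter period out of phase with an amplitude proportional to $|\alpha| = \sqrt{\bar n_\alpha}$, while the variance of each field stays equal to $\hbar/2$ at all times, so the relative size of the fluctuations falls as $1/\sqrt{\bar n_\alpha}$.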

7.4 Problems

Problems for Sect. 7.1

Problem 83 Using Eq. 7.16 together with Eqs. 7.12 and 7.13, find the time dependence of the coordinate $x$ and momentum $p$. Comparing the found result with Eq. 7.9, find the relation between parameters $b_0$, $b_0^*$ and $x_0$, $p_0$.

Problem 84 Verify that Eqs. 7.14 and 7.15 are equivalent to the Hamiltonian equations for the regular coordinate and momentum by computing the time derivatives of the variables $b$ and $b^*$ using Eqs. 7.12 and 7.13 together with Eqs. 7.7 and 7.8.

Problem 85 Prove that $\hat a\,|n\rangle = \sqrt{n}\,|n-1\rangle$.


Problem 86 Suppose that a harmonic oscillator is at $t = 0$ in the state described by the following superposition:

$$|\alpha_0\rangle = a\left(\sqrt{2}\,|0\rangle + \sqrt{3}\,|1\rangle\right).$$

1. Normalize the state.
2. Find a vector $|\alpha(t)\rangle$ representing the state of the oscillator at an arbitrary time $t$.
3. Calculate the uncertainties of the coordinate and momentum operators in this state, and check that the uncertainty relation is fulfilled at all times.

Problem 87 Using the method of mathematical induction, prove that

$$\left(y - \frac{d}{dy}\right)^n \exp\left(-\frac{y^2}{2}\right) = (-1)^n \exp\left(\frac{y^2}{2}\right)\frac{d^n \exp\left(-y^2\right)}{dy^n}$$

and derive Eq. 7.44 for the coordinate representation of an eigenvector of the Hamiltonian of the harmonic oscillator.

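Problem 88 Using matrices $a_{mn} = \langle m|\hat a|n\rangle$ and $a^\dagger_{mn} = \langle m|\hat a^\dagger|n\rangle$, demonstrate by direct matrix multiplication that

$$\left(\hat a^\dagger\hat a\right)_{mn} = \sum_k a^\dagger_{mk}\,a_{kn} = m\,\delta_{mn}.$$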

Problem 89 Using the coordinate representation of the lowering operator $\hat a$, apply it to the coordinate representation of the $n = 3$ stationary state of the harmonic oscillator. Is the result normalized? If not, normalize it and compare the found normalization factor with Eq. 7.40.

Problem 90 Using lowering and raising operators, compute the expectation values of the kinetic, $\hat K$, and potential, $\hat V$, energies of a harmonic oscillator in an arbitrary stationary state $|n\rangle$. Check that

$$\langle\hat K\rangle = \langle\hat V\rangle.$$

This result is a particular case of the so-called virial theorem relating the expectation values of kinetic and potential energies of a particle in the potential described by $V = kx^p$. The general form of the theorem is $2\langle\hat K\rangle = p\,\langle\hat V\rangle$, which for $p = 2$ (harmonic oscillator) is reduced to the result of this problem.

Problem 91 Derive explicit expressions for Hermite polynomials with $n = 3, 4, 5$ (of course, you can always google it, but do it by yourselves; you can learn something), and demonstrate explicitly that they obey the orthogonality relation:

$$\int_{-\infty}^{\infty} \exp\left(-x^2\right) H_m(x)\,H_n(x)\,dx = 0, \quad m \neq n.$$


Problem 92

1. Find eigenvectors $|\alpha\rangle$ of the lowering operator $\hat a$, $\hat a\,|\alpha\rangle = \alpha\,|\alpha\rangle$, in the coordinate representation. Normalize them.
2. Show that the raising operator $\hat a^\dagger$ does not have normalizable eigenvectors.

Problem 93 Compute the probability that a measurement of the coordinate will yield a value in the classically forbidden region for the oscillator prepared in each of the following stationary states: $|0\rangle$, $|1\rangle$, and $|3\rangle$. (Note that the boundary of the classically allowed region is different for each of these states.)

Problem 94 Consider an electron with mass $m_e$ and charge $-e$ in a harmonic potential $\hat V = m_e\omega^2 x^2/2$ also subjected to a uniform electric field $E$ in the positive $x$ direction.

1. Write down the Hamiltonian for this system.
2. Using operator identities from Sect. 3.2.2, prove that

$$\exp\left(\frac{i\hat p_x d}{\hbar}\right)\hat x\,\exp\left(-\frac{i\hat p_x d}{\hbar}\right) = \hat x + d, \tag{7.117}$$

where $\hat x$ and $\hat p_x$ are the regular operators of the coordinate and the respective component of the momentum and $d$ is a real number.

3. In Sect. 5.1.2 I already demonstrated using the example of a parity operator that if two vectors are related to each other as $|\beta\rangle = \hat T|\alpha\rangle$, while vectors $|\tilde\beta\rangle$ and $|\tilde\alpha\rangle$ are defined as $|\tilde\beta\rangle = \hat U|\beta\rangle$, $|\tilde\alpha\rangle = \hat U|\alpha\rangle$, one can show that $|\tilde\beta\rangle = \hat T'|\tilde\alpha\rangle$, where $\hat T' = \hat U\hat T\hat U^{-1}$. Use this relation together with Eq. 7.117 to reduce the Hamiltonian found in Part I of this problem to that of a harmonic oscillator without the electric field, and express the eigenvectors of the Hamiltonian with the field (perturbed Hamiltonian) in terms of the eigenvectors of the Hamiltonian without the field (unperturbed).

4. Write down the coordinate wave function representing the states of the perturbed Hamiltonian in terms of the wave functions representing the states of the unperturbed Hamiltonian. Comment on the results. Explain how it can be derived by manipulating the classical Hamiltonian before its quantization.

5. If the electron is in its ground state before the electric field is turned on, find the probability that the electron will be found in the ground state of the Hamiltonian with the electric field on. (Hint: You will need to use the operator identities concerning the exponential function of a sum of operators discussed in Sect. 3.2.2 and the representation of the momentum operator in terms of raising and lowering operators. Remember: The exponential function of an operator is defined as the corresponding power series.)


Problems for Sect. 7.1.2

Problem 95 Using the Heisenberg representation, find the uncertainty of coordinate and momentum operators at an arbitrary time $t$ for the state

$$|\alpha\rangle = \frac{1}{\sqrt{3}}\left(|1\rangle + |2\rangle + |3\rangle\right),$$

where $|n\rangle$ is the $n$th stationary state of the harmonic oscillator. Verify that the uncertainty relation is fulfilled at all times.

Problem 96 Solve the previous problem using the Schrödinger representation.

Problem 97 Consider the system described in Problem 94, but work now in the Heisenberg picture.

1. Write down the Hamiltonian of the electron in the Heisenberg picture.
2. Write down the Heisenberg equations for lowering and raising operators and solve them.
3. Now, assume that the electric field was turned on at $t = 0$, when the electron was in the ground state $|0\rangle$ of the unperturbed Hamiltonian, and turned back off at $t = t_f$. In the Heisenberg picture, the state of the system does not change, so that all time evolution is described by the operators. Let us call the lowering and raising operators at $t = 0$ $\hat a_{in}$, $\hat a^\dagger_{in}$ (these are, obviously, the same operators that appear as initial conditions in the solutions of the Heisenberg equations found in Part I of the problem). These operators are just lowering and raising operators in the Schrödinger picture, so that the initial state obeys equation $\hat a_{in}|0\rangle = 0$. In the Heisenberg picture, raising and lowering operators change with time according to the expressions found in Part I. Considering these expressions at $t = t_f$, you will find $\hat a_f \equiv \hat a(t_f)$ and $\hat a^\dagger_f = \hat a^\dagger(t_f)$. Verify that these operators have the same commutation relation as their Schrödinger counterparts.
4. The time evolution of the Hamiltonian, which at all times has the form found in Part I, is completely described by the time dependence of lowering and raising operators. Using the expressions for $\hat a_f$ and $\hat a^\dagger_f$ found in the previous part of the problem, write down the Hamiltonian of the electron at times $t > t_f$ in terms of operators $\hat a_{in}$, $\hat a^\dagger_{in}$.
5. Using the found expression for the Hamiltonian, find the expectation value of energy in the given initial state.
6. The Hamiltonian of the electron at $t > t_f$ has the same form in terms of operators $\hat a_f$, $\hat a^\dagger_f$, as the Hamiltonian for $t < 0$ has in terms of operators $\hat a_{in}$, $\hat a^\dagger_{in}$. Also, it has been shown in Part III that $\hat a_f$, $\hat a^\dagger_f$ have the same commutation relations as $\hat a_{in}$, $\hat a^\dagger_{in}$. This means that the Hamiltonian at $t > t_f$ has the same eigenvalues, and its eigenvectors satisfy the same relations:

$$\hat a_f\,|0\rangle_f = 0$$

$$|n\rangle_f = \frac{1}{\sqrt{n!}}\left(\hat a^\dagger_f\right)^n |0\rangle_f$$

where the first equation defines the new vacuum state $|0\rangle_f$ and the second equation defines new eigenvectors. Since operators $\hat a_f$, $\hat a^\dagger_f$ differ from $\hat a_{in}$, $\hat a^\dagger_{in}$, the new ground state and the new eigenvectors will be different from those of the initial Hamiltonian. Using the representation of $\hat a_f$ in terms of $\hat a_{in}$, find the probability that if the system started out in the ground state of the initial Hamiltonian, it will be found in the new ground state $|0\rangle_f$.

Problems for Sect. 7.2

Problem 98 Verify Eqs. 7.71 and 7.72.

Problem 99 Rewrite the Schrödinger equation for the stationary states of a 3-D isotropic harmonic oscillator in cylindrical coordinates, $\rho, \varphi, z$. Show that the wave function can be written down as $\psi_{n_1,n_2,m} = Z_{n_1}(z)\,R_{n_2}(\rho)\exp(im\varphi)$, and derive equations for functions $Z_{n_1}(z)$ and $R_{n_2}(\rho)$. The first of these equations will coincide with the Schrödinger equation for a one-dimensional harmonic oscillator, so you can use the results of Sect. 7.1.1 to determine this function and the corresponding contribution to energy, but the equation for $R_{n_2}(\rho)$ will have to be solved from scratch. Do it using the power series method developed in the text for the spherical coordinates.

Problem 100 You just saw that the wave functions of an isotropic oscillator can be presented using Cartesian, spherical, and cylindrical coordinates. While the functions corresponding to the same degenerate energy value have very different forms in each of these coordinate systems, all of them represent eigenvectors belonging to the same eigenvalue, so you should be able to present each of them as a linear combination of the others belonging to that eigenvalue. Verify that this is indeed the case for states belonging to energy value $E = 5\hbar\omega/2$ by explicitly expressing wave functions written in Cartesian coordinates in terms of their spherical and cylindrical coordinate counterparts.

Problems for Sect. 7.3.2

Problem 101 Verify Eqs. 7.115 and 7.116 for time-dependent expectation values of the squares of electric and magnetic fields.


Problem 102 The flow of the energy of the electromagnetic field is described by the Poynting vector, which in SI units is given by

$$\mathbf{S} = \frac{1}{\mu_0}\,\mathbf{E}\times\mathbf{B}.$$

In our toy model of the electromagnetic field, the Poynting vector becomes simply

$$S = \frac{1}{\mu_0}\,E_x B_y.$$

In quantum theory, the Poynting vector becomes an operator. Find the time-dependent expectation value and uncertainty of this operator in the coherent state.

Chapter 8 Hydrogen Atom

8.1 Transition to a One-Body Problem

Quantum mechanics of the atom of hydrogen, understood as a system consisting of a positively charged nucleus and a single negatively charged electron, is remarkable in many respects. This is one of the very few exactly solvable three-dimensional models with a realistic interaction potential. As such it provides the foundation for much of our qualitative as well as quantitative understanding of optical properties of atoms, at least as a first approximation for more complicated situations. A similar model also arises in the physics of semiconductors, where bound states of negative and positive charges form entities known as excitons, as well as in situations involving a single conduction electron interacting with a charged impurity. Another curious property of this model is that the energy eigenvalues emerging from the exact solution of the Schrödinger equation coincide with the energy levels predicted by the heuristic Bohr model based on a rather arbitrary combination of Newton’s laws with a simple quantization rule for the angular momentum. While it might seem like a pure coincidence of limited significance, given that by now we have harnessed the full power of quantum theory and do not really need Bohr’s quantization rules, one still might wonder by how much the development of quantum physics would have been delayed if it were not for this “coincidence.”

I will begin exploration of this model with a brief reminder of how classical mechanics deals with the problem. There are two different aspects to it which need to be addressed. First, unlike all previous models considered so far, which involved a single particle, this is a two-body problem. Luckily for us, this problem only pretends to be two-body and can be easily reduced to two single-particle problems. This is how it is done in classical physics. The classical Hamiltonian of the problem has the following form:

$$H = \frac{\mathbf{p}_1^2}{2m_p} + \frac{\mathbf{p}_2^2}{2m_e} + V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right), \tag{8.1}$$

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_8


where $\mathbf{p}_1, \mathbf{r}_1$ and $\mathbf{p}_2, \mathbf{r}_2$ are momentums and positions of the particles with corresponding masses $m_p$ and $m_e$ and $V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right)$ is the Coulomb potential energy, which in the SI units can be written as

$$V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right) = -\frac{1}{4\pi\varepsilon_r\varepsilon_0}\,\frac{Ze^2}{|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.2}$$

Here $e$ is the elementary charge, $Z$ is the atomic number of the nucleus introduced to allow dealing with heavier hydrogen-like atoms such as atoms of alkali metals (or a charged impurity), and $\varepsilon_r$ is the relative dielectric permittivity accounting for a possibility that the interacting particles are inside a dielectric medium. To separate this problem into two single-particle problems, I introduce new coordinates:

$$\mathbf{R} = \frac{m_p\mathbf{r}_1 + m_e\mathbf{r}_2}{m_p + m_e}, \tag{8.3}$$

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \tag{8.4}$$

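I hope you have recognized in $\mathbf{R}$ a coordinate of the center of mass of two particles and in $\mathbf{r}$ their relative position vector. Now, I need to find the new momentums associated with these coordinates. For your sake I will avoid using the formalism of canonical transformations in Hamiltonian mechanics and will begin with defining the kinetic energy in terms of the respective velocities. Reversing Eqs. 8.3 and 8.4, I get

$$\mathbf{r}_1 = \mathbf{R} + \frac{m_e}{m_p + m_e}\,\mathbf{r},$$

$$\mathbf{r}_2 = \mathbf{R} - \frac{m_p}{m_p + m_e}\,\mathbf{r},$$

so that the kinetic energy can be found as

$$\begin{aligned}
K &= \frac{1}{2}m_p\left(\frac{d\mathbf{R}}{dt} + \frac{m_e}{m_p + m_e}\frac{d\mathbf{r}}{dt}\right)^2 + \frac{1}{2}m_e\left(\frac{d\mathbf{R}}{dt} - \frac{m_p}{m_p + m_e}\frac{d\mathbf{r}}{dt}\right)^2 \\
&= \frac{1}{2}\left[\left(m_p + m_e\right)\left(\frac{d\mathbf{R}}{dt}\right)^2 + \frac{m_p m_e^2}{\left(m_p + m_e\right)^2}\left(\frac{d\mathbf{r}}{dt}\right)^2 + \frac{m_e m_p^2}{\left(m_p + m_e\right)^2}\left(\frac{d\mathbf{r}}{dt}\right)^2\right] \\
&= \frac{1}{2}\left[\left(m_p + m_e\right)\left(\frac{d\mathbf{R}}{dt}\right)^2 + \frac{m_p m_e}{m_p + m_e}\left(\frac{d\mathbf{r}}{dt}\right)^2\right].
\end{aligned}$$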
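The cancellation of the cross terms in this expression is easy to confirm numerically. The sketch below compares the kinetic energy computed from the individual particle velocities with the sum of a center-of-mass part and a relative part; the masses, velocities, and function names are arbitrary illustrative values chosen here, not quantities taken from the text.

```python
def dot(u, v):
    """Scalar product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, v))


def kinetic_energy_particles(m_p, m_e, v1, v2):
    """K = m_p v1^2 / 2 + m_e v2^2 / 2 in terms of the individual velocities."""
    return 0.5 * m_p * dot(v1, v1) + 0.5 * m_e * dot(v2, v2)


def kinetic_energy_cm_relative(m_p, m_e, v1, v2):
    """K = M V^2 / 2 + mu v^2 / 2 with V = dR/dt and v = dr/dt."""
    total_mass = m_p + m_e
    reduced_mass = m_p * m_e / total_mass
    v_cm = [(m_p * a + m_e * b) / total_mass for a, b in zip(v1, v2)]
    v_rel = [a - b for a, b in zip(v1, v2)]
    return 0.5 * total_mass * dot(v_cm, v_cm) + 0.5 * reduced_mass * dot(v_rel, v_rel)


# Hypothetical masses and velocities in arbitrary units.
m_p, m_e = 1836.15, 1.0
v1 = (0.3, -1.2, 0.7)
v2 = (-2.0, 0.5, 1.1)

k_particles = kinetic_energy_particles(m_p, m_e, v1, v2)
k_separated = kinetic_energy_cm_relative(m_p, m_e, v1, v2)
print(k_particles, k_separated)
```

The two numbers agree to rounding error for any choice of masses and velocities, which is the numerical counterpart of the algebra above.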

Introducing two new masses, the total mass of the system $M = m_p + m_e$ and the reduced mass $\mu = m_p m_e/\left(m_p + m_e\right)$, I can define the momentum of the center of mass:

$$\mathbf{p}_R = M\frac{d\mathbf{R}}{dt},$$

and relative momentum

$$\mathbf{p}_r = \mu\frac{d\mathbf{r}}{dt},$$

so that the Hamiltonian, Eq. 8.1, can be rewritten as

$$H = \frac{\mathbf{p}_R^2}{2M} + \frac{\mathbf{p}_r^2}{2\mu} + V(r).$$

The corresponding Hamiltonian equations are separated into a pair of equations for the position and momentum of the center of mass:

$$\frac{d\mathbf{R}}{dt} = \frac{\mathbf{p}_R}{M}, \qquad \frac{d\mathbf{p}_R}{dt} = 0,$$

and for the relative motion

$$\frac{d\mathbf{r}}{dt} = \frac{\mathbf{p}_r}{\mu}, \qquad \frac{d\mathbf{p}_r}{dt} = -\frac{dV}{d\mathbf{r}}.$$

The first pair of these equations describes a uniform motion of a free particle—the center of mass of the system—while the second pair describes the motion of a single particle in potential V.r/.

I have little doubt that variables $\mathbf{r}$ and $\mathbf{p}_r$ form a canonically conjugated pair, and so I can transition to the quantum description by promoting them to operators with the standard commutation relation $\left[r_i, p_{rj}\right] = i\hbar\delta_{i,j}$. However, in order to be 100% sure and convince all the possible skeptics, I do need to verify this fact by computing Poisson brackets with these variables. To this end I need to express $\mathbf{r}$ and $\mathbf{p}_r$ in terms of the initial coordinates and momentums. The expression for $\mathbf{r}$ is given by Eq. 8.4, so I only need to figure out $\mathbf{p}_r$:

$$\mathbf{p}_r = \frac{m_e m_p}{m_e + m_p}\left(\frac{d\mathbf{r}_1}{dt} - \frac{d\mathbf{r}_2}{dt}\right) = \frac{m_e}{m_e + m_p}\,\mathbf{p}_1 - \frac{m_p}{m_e + m_p}\,\mathbf{p}_2. \tag{8.5}$$

Let me focus for concreteness on the $x$-components of the momentum and coordinate. Equation 3.4 for the Poisson bracket, where summation must include the sum over coordinates of both particles, yields

$$\{x, p_{rx}\} = \frac{\partial x}{\partial x_1}\frac{\partial p_{rx}}{\partial p_{1x}} + \frac{\partial x}{\partial x_2}\frac{\partial p_{rx}}{\partial p_{2x}} = \frac{m_e}{m_e + m_p} + \frac{m_p}{m_e + m_p} = 1$$

as expected. All other Poisson brackets also predictably produce the necessary results, so you can start breathing again.
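As an illustration of the "other" brackets, here is a short worked check of two of them. It relies on the relation $\mathbf{p}_R = M\,d\mathbf{R}/dt = \mathbf{p}_1 + \mathbf{p}_2$, which follows from Eq. 8.3, and uses $X$ for the $x$-component of $\mathbf{R}$; this labeling is introduced here only for the example. Since the coordinates do not depend on the momentums and the momentums do not depend on the coordinates, only the terms of the same type as above survive:

$$\{X, p_{rx}\} = \frac{\partial X}{\partial x_1}\frac{\partial p_{rx}}{\partial p_{1x}} + \frac{\partial X}{\partial x_2}\frac{\partial p_{rx}}{\partial p_{2x}} = \frac{m_p}{M}\,\frac{m_e}{M} - \frac{m_e}{M}\,\frac{m_p}{M} = 0,$$

$$\{x, p_{Rx}\} = \frac{\partial x}{\partial x_1}\frac{\partial p_{Rx}}{\partial p_{1x}} + \frac{\partial x}{\partial x_2}\frac{\partial p_{Rx}}{\partial p_{2x}} = 1\cdot 1 + (-1)\cdot 1 = 0,$$

with $M = m_p + m_e$. The center-of-mass and relative variables are therefore independent of each other in the canonical sense.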


8.2 Eigenvalues and Eigenvectors

It is important that the portion of the Hamiltonian describing the motion of the center of mass, $\hat H_R = \hat p_R^2/2M$, is completely independent from the part responsible for the relative motion

$$\hat H_r = \frac{\hat p_r^2}{2\mu} + V(\hat r) \tag{8.6}$$

so that the eigenvectors of the total Hamiltonian $\hat H = \hat H_R + \hat H_r$ can be written down as $|\psi_R\rangle|\psi_r\rangle$, where the first vector is an eigenvector of $\hat H_R$ with eigenvalue $E_R$, while the second vector is the eigenvector of $\hat H_r$ with its own eigenvalue $E_r$. The eigenvalue of the total Hamiltonian is easily verified to be $E_R + E_r$ (when verifying this statement, remember that $\hat H_R$ acts only on $|\psi_R\rangle$, while $\hat H_r$ only affects $|\psi_r\rangle$). I am going to ignore the center of mass motion and will focus on the Hamiltonian $\hat H_r$, Eq. 8.6, with the Coulomb potential energy, Eq. 8.2. In what follows I will omit the subindex $r$ in the Hamiltonian.

What I am dealing with here is yet another example of a particle moving in a central potential, similar to the isotropic harmonic oscillator problem considered in Sect. 7.2. Just like in the case of a harmonic oscillator, Hamiltonian 8.6 commutes with angular momentum operators $\hat L^2$ and $\hat L_z$; thus its eigenvectors are also eigenvectors of the angular momentum. Working in the position representation and using spherical coordinates to represent the position, I can again write down for the wave function

$$\psi_{n,l,m}(r, \theta, \varphi) = Y_l^m(\theta, \varphi)\,R_{nl}(r),$$

where $Y_l^m(\theta, \varphi)$ are spherical harmonics, the coordinate representation of the eigenvectors of angular momentum operators. The equation for the remaining radial function $R_{nl}(r)$ is derived in exactly the same way as in Sect. 7.2 and takes the form similar to Eq. 7.71:

$$-\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left[r^2\frac{dR_{n_r,l}}{dr}\right] + \frac{\hbar^2 l(l+1)}{2\mu r^2}R_{n_r,l} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r}R_{n_r,l} = E_{l,n_r}R_{n_r,l} \tag{8.7}$$

with the obvious replacements of $m_e$ by $\mu$ and of the quadratic harmonic oscillator potential by the Coulomb potential. Eigenvalues of energy $E_{l,n_r}$ are found by looking for normalizable solutions to this equation. My choice of the indexes to label the eigenvalues reflects the fact that the eigenvalues of the Hamiltonian with any central potential do not depend on $m$. Indeed, quantum number $m$ is defined with respect to a particular choice of the polar axis $Z$, but since the energy of a system with a central potential cannot depend upon an arbitrary axis choice, it should not depend on this quantum number. Here is another example of how symmetry considerations help to analyze the problem.


I will begin by reducing Eq. 8.7 to a dimensionless form, as is customary in this type of situation. What I need for this is a characteristic length scale, which in this problem, unlike the harmonic oscillator case, is not that obvious. But there is a trick which I can use to find it, and I am going to share it with you. Just by looking at the radial equation, I know that there are three main parameters in this problem: mass $\mu$, charge $e$, and Planck's constant $\hbar$, and I need to find their combination with the dimension of length. This is done by first writing down this combination in the most generic form as $\mu^{\alpha}\tilde{e}^{\beta}\hbar^{\gamma}$, where $\tilde{e} = e/\sqrt{4\pi\varepsilon_0\varepsilon_r}$ combines the charge with the vacuum and relative permittivities, $\varepsilon_0$ and $\varepsilon_r$ respectively, as they appear in the Coulomb law in SI units, while $\alpha$, $\beta$, and $\gamma$ are unknown powers to be determined. In the next step, I will present the dimension of each factor in this expression in terms of the basic quantities: length, time, and mass. For instance, the dimension of $\tilde{e}$ can be found from the Coulomb law as $[\tilde{e}] = [F]^{1/2}[L]$, where $[F]$ stands for the dimension of force and $[L]$ stands for the dimension of length. The dimension of force in basic quantities is $[F] = [M][L][T]^{-2}$, where $[M]$ represents the dimension of mass and $[T]$ represents the dimension of time (think of Newton's second law). So, for the effective charge, I have $[\tilde{e}] = [M]^{1/2}[L]^{3/2}[T]^{-1}$. The dimension of Planck's constant can be determined from the Einstein–de Broglie relation between energy and frequency as $[\hbar] = [E][T] = [M][L]^{2}[T]^{-1}$, where in the second step I expressed the dimension of energy $[E]$ as $[E] = [F][L]$. Combining the results for the charge and Planck's constant, I find

$$
\left[\mu^{\alpha}\tilde{e}^{\beta}\hbar^{\gamma}\right] = [M]^{\alpha}[M]^{\beta/2}[L]^{3\beta/2}[T]^{-\beta}[M]^{\gamma}[L]^{2\gamma}[T]^{-\gamma} = [M]^{\alpha+\beta/2+\gamma}[L]^{3\beta/2+2\gamma}[T]^{-\beta-\gamma}.
$$

If I want this expression to have the dimension of length $[L]$, I need to eliminate the excess dimensions $[M]$ and $[T]$. Remembering that any quantity raised to the power of zero turns to unity and becomes dimensionless, I can eliminate $[M]$ and $[T]$ by requiring that their corresponding powers vanish:

$$
\alpha + \beta/2 + \gamma = 0, \qquad \beta + \gamma = 0.
$$

Then all that is left to do is to make the power of $[L]$ equal to unity:

$$
3\beta/2 + 2\gamma = 1.
$$

The result is a system of equations for the unknown powers, solving which I find $\gamma = 2$, $\beta = -2$, and $\alpha = -1$, i.e., the characteristic length scale can be constructed using the parameters at our disposal as

$$
a_B = \frac{4\pi\varepsilon_0\varepsilon_r\hbar^2}{e^2\mu}. \tag{8.8}
$$
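The three conditions on the powers form a small linear system, and solving it mechanically is a useful check on the algebra. The following is only a sketch: the helper names (`solve_3x3`, `length_scale_powers`) and the representation of dimensions as (mass, length, time) exponent triples are choices made for this illustration, not anything taken from the text.

```python
from fractions import Fraction

# Dimensions written as exponents of (mass, length, time).
DIM_MU = (Fraction(1), Fraction(0), Fraction(0))               # [M]
DIM_E_TILDE = (Fraction(1, 2), Fraction(3, 2), Fraction(-1))   # [M]^1/2 [L]^3/2 [T]^-1
DIM_HBAR = (Fraction(1), Fraction(2), Fraction(-1))            # [M] [L]^2 [T]^-1


def solve_3x3(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = 3
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def length_scale_powers():
    """Powers (alpha, beta, gamma) that give mu^alpha e~^beta hbar^gamma the dimension of length."""
    # Row k holds the exponents of base dimension k contributed by mu, e~, hbar.
    matrix = [[DIM_MU[k], DIM_E_TILDE[k], DIM_HBAR[k]] for k in range(3)]
    target = [Fraction(0), Fraction(1), Fraction(0)]  # no mass, one length, no time
    return solve_3x3(matrix, target)


alpha, beta, gamma = length_scale_powers()
print(alpha, beta, gamma)
```

Changing `target` to the exponents of energy, (1, 2, -2), gives the powers of the characteristic energy scale in the same way.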


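The characteristic length found this way is actually well known from Bohr's theory of atomic spectra and is called the Bohr radius. It can be used to introduce a dimensionless coordinate $\varsigma = r/a_B$ and rewrite Eq. 8.7 as

$$
-\frac{1}{\varsigma^2}\frac{d}{d\varsigma}\left(\varsigma^2\frac{dR_{n_r,l}}{d\varsigma}\right) + \frac{l(l+1)}{\varsigma^2}R_{n_r,l} - \frac{2Z}{\varsigma}R_{n_r,l} = \frac{2\left(4\pi\varepsilon_0\varepsilon_r\right)^2\hbar^2}{e^4\mu}E_{l,n_r}R_{n_r,l}. \tag{8.9}
$$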

You can verify (do it yourselves) that the quantity

$$
\tilde{E} = \frac{e^4\mu}{32\pi^2\varepsilon_0^2\varepsilon_r^2\hbar^2} \tag{8.10}
$$

has the dimension of energy, so that I can present the right-hand side of this equation in terms of the dimensionless energy parameter

$$
\epsilon_{l,n} = E_{l,n}/\tilde{E}.
$$

Finally, introducing the auxiliary radial function $u_{n,l} = \varsigma R_{n,l}$ (the same as in the harmonic oscillator problem), I obtain the effective one-dimensional Schrödinger equation similar to Eq. 7.73:

$$
-\frac{d^2u_{n_r,l}}{d\varsigma^2} + \frac{l(l+1)}{\varsigma^2}u_{n_r,l} - \frac{2Z}{\varsigma}u_{n_r,l} = \epsilon_{l,n_r}u_{n_r,l}. \tag{8.11}
$$

The effective potential in Eq. 8.11 is positively infinite at small $\varsigma$, but as $\varsigma$ increases, it, unlike the harmonic oscillator problem, becomes negative, reaches a minimum value of $-Z^2/[l(l+1)]$ at $\varsigma = l(l+1)/Z$, and remains negative while approaching zero for $\varsigma \to \infty$; see Fig. 8.1. Classical behavior in such a potential is bound for negative values of energy and unbound for positive energies. In the former case, we are dealing with a particle moving along a closed elliptical orbit, while in the

Fig. 8.1 Dimensionless effective potential as a function of dimensionless radial coordinate


latter case, the situation is better described in terms of scattering of a particle by the potential. In the quantum description, as usual, we shall expect states characterized by a discrete spectrum of eigenvalues for classically bound motion (negative energies) and states with a continuous spectrum for positive energies. The wave functions representing states of the continuous spectrum are well known but rather complex mathematically and are not used too frequently, so I shall avoid dealing with them for the sake of keeping everyone sane. The range of negative energies is much more important for understanding the physical processes in atoms and is more tractable. In terms of atomic physics, the states with negative energies correspond to intact atoms, where the electron is bound to its nucleus, and the probability that it will turn up infinitely far from the nucleus is zero. The states of the continuous spectrum correspond to ionized atoms, where the energy of the electron is too large for the nucleus to be able to "catch" it, so that the electron can be found at arbitrarily large distances from the nucleus.
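The position and depth of the minimum of the effective potential quoted above follow from a short calculation. In the sketch below, $V_{\mathrm{eff}}$ is a label introduced only for this illustration; it denotes the sum of the centrifugal and Coulomb terms of Eq. 8.11.

```latex
V_{\mathrm{eff}}(\varsigma) = \frac{l(l+1)}{\varsigma^{2}} - \frac{2Z}{\varsigma},
\qquad
\frac{dV_{\mathrm{eff}}}{d\varsigma} = -\frac{2l(l+1)}{\varsigma^{3}} + \frac{2Z}{\varsigma^{2}} = 0
\;\Rightarrow\;
\varsigma_{\min} = \frac{l(l+1)}{Z},
\qquad
V_{\mathrm{eff}}(\varsigma_{\min}) = \frac{Z^{2}}{l(l+1)} - \frac{2Z^{2}}{l(l+1)} = -\frac{Z^{2}}{l(l+1)} .
```

These expressions assume $l \geq 1$. For $l = 0$ the centrifugal term is absent, the potential reduces to $-2Z/\varsigma$, and there is no minimum at finite $\varsigma$.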

The process of finding the solution of Eq. 8.11 follows the same steps as solving a similar equation for the harmonic oscillator: find the asymptotic behavior at small and large $\varsigma$, factor it out, and present the residual function as a power series. I can, however, simplify the form of the equation a bit more by replacing variable $\varsigma$ with a new variable $\rho = \varsigma\sqrt{-\epsilon_{l,n_r}}$ (remember, $\epsilon_{l,n_r} < 0$!). Equation 8.11 now takes the following form:

$$
\frac{d^2u_{n_r,l}}{d\rho^2} - \frac{l(l+1)}{\rho^2}u_{n_r,l} + \frac{\kappa_{l,n_r}}{\rho}u_{n_r,l} - u_{n_r,l} = 0 \tag{8.12}
$$

where I introduced a new parameter

$$
\kappa_{l,n_r} = \frac{2Z}{\sqrt{-\epsilon_{l,n_r}}}.
$$

The asymptotic behavior at small $\rho$ is determined by the contribution from the angular momentum and is the same as for the harmonic oscillator, $u_{n_r,l} \propto \rho^{l+1}$, but the large $\rho$ limit is now determined by the $u_{n_r,l}$ term. The resulting equation

$$
\frac{d^2u_{n_r,l}}{d\rho^2} = u_{n_r,l}
$$

has two obvious solutions

$$
u_{n_r,l} \propto \exp\left(\pm\rho\right)
$$

of which I will only keep the exponentially decreasing one in the hope of ending up with a normalizable solution. Thus, I am looking for the solution in the form

$$
u_{n_r,l} = \rho^{l+1}\exp\left(-\rho\right)v_{n,l}(\rho), \tag{8.13}
$$


where a differential equation for the reduced function $v_{n,l}$ is derived by substituting Eq. 8.13 into Eq. 8.12. The rest of the procedure is quite similar to the one I outlined in the harmonic oscillator problem: present $v_{n,l}$ as a power series with respect to $\rho$, derive the recursion relation for the coefficients of the expansion, verify that the asymptotic behavior of the resulting power series yields a non-normalizable wave function, restore normalizability by requiring that the power series terminates after a finite number of terms, and obtain an equation for the energy values consistent with the normalizability requirement. Leaving details of this analysis to the readers as an exercise (of course, you can always cheat by looking it up in a number of other textbooks, but you will gain so much more in terms of your technical prowess and self-respect by doing it yourselves!), I will present the result. The only way to ensure normalizability of the resulting wave functions is to require that parameter $\kappa_{l,n_r}$ satisfies the following condition:

$$
\kappa_{l,n_r} = 2\left(j_{\max} + l + 1\right) \tag{8.14}
$$

where $j_{\max}$ is the number of the largest non-zero coefficient in the power series expansion of function

$$
v_{n,l}(\rho) = \sum_{j} c_j\rho^{j}
$$

and takes arbitrary integer values starting from 0. However, since $j_{\max}$ appears in Eq. 8.14 only in combination with $l$, the actual allowed values of $\kappa_{l,n_r}$ depend on a single parameter, called the principal quantum number $n$, which takes any integer value starting from $n = 1$. Independence of the energy eigenvalues of a hydrogen Hamiltonian of the angular momentum number $l$ is a peculiarity of the Coulomb potential and reflects an additional symmetry present in this problem. In classical mechanics this symmetry manifests itself via the existence of a supplemental (to energy and angular momentum) conserved quantity called the Laplace–Runge–Lenz vector

$$
\mathbf{A} = \mathbf{p}\times\mathbf{L} - \mu\frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\mathbf{e}_r
$$

where $\mathbf{e}_r$ is a unit vector in the radial direction. In quantum theory this vector can be promoted to a Hermitian operator, but this procedure is not trivial because operators $\hat{\mathbf{p}}$ and $\hat{\mathbf{L}}$ do not commute. However, I am afraid that if I continue talking about the quantum version of the Laplace–Runge–Lenz vector, I might open myself to a lawsuit for inflicting cruel and unusual punishment on the readers, so I will restrain myself. Those who are not afraid may look it up, but quantum treatment of the Laplace–Runge–Lenz vector is not very common even in the wild prairies of the Internet.
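The termination condition of Eq. 8.14 can be explored numerically before attempting the full derivation. The sketch below assumes the recursion $c_{j+1} = c_j\,[2(j+l+1) - \kappa]/[(j+1)(j+2l+2)]$, which is what substituting Eq. 8.13 into Eq. 8.12 appears to give; it is not quoted from the text, so treat it as something to verify against your own solution of the exercise. The function names are likewise hypothetical.

```python
from fractions import Fraction


def series_coefficients(kappa, l, max_terms=12):
    """Coefficients c_j of v(rho) = sum_j c_j rho^j, normalized so that c_0 = 1.

    Assumed recursion (derived for this sketch, not taken from the text):
        c_{j+1} = [2 (j + l + 1) - kappa] / [(j + 1) (j + 2 l + 2)] * c_j
    """
    coeffs = [Fraction(1)]
    for j in range(max_terms - 1):
        factor = Fraction(2 * (j + l + 1)) - Fraction(kappa)
        coeffs.append(coeffs[-1] * factor / ((j + 1) * (j + 2 * l + 2)))
    return coeffs


def highest_nonzero_index(coeffs):
    """Index of the last non-zero coefficient among those computed."""
    return max(j for j, c in enumerate(coeffs) if c != 0)


# kappa = 2 (j_max + l + 1) with l = 1, j_max = 2: the series stops at j = 2.
print(series_coefficients(8, 1)[:5])
# A value of kappa that violates Eq. 8.14: no coefficient ever vanishes.
print(highest_nonzero_index(series_coefficients(7, 1)))
```

Exact fractions are used so that the vanishing of a coefficient is unambiguous rather than a matter of floating-point tolerance.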

Anyway, I can now drop the double-index notation in $\kappa$ and the dimensionless energy $\epsilon$ and classify the latter with a single index, the principal quantum number $n$. Taking into account Eq. 8.14 and introducing

$$
n = j_{\max} + l + 1, \tag{8.15}
$$


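I find for the allowed energy values

$$
E_n = \tilde{E}\epsilon_n = -\frac{Z^2}{n^2}\tilde{E} = -\frac{Z^2e^4\mu}{32\pi^2\varepsilon_r^2\varepsilon_0^2\hbar^2}\frac{1}{n^2} = -\frac{E_g}{n^2} \tag{8.16}
$$

where I introduced a separate notation $E_g$ for the magnitude of the ground state energy. It is also useful sometimes to have this expression written in terms of the Bohr radius $a_B$ defined by Eq. 8.8:

$$
E_n = -\frac{Z^2e^2}{8\pi\varepsilon_r\varepsilon_0a_B}\frac{1}{n^2}. \tag{8.17}
$$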

For a pure hydrogen atom in vacuum, $Z = 1$ and $\varepsilon_r = 1$, and taking into account that the ratio of the mass of a proton (the nucleus of a hydrogen atom is a single proton) to the mass of the electron is approximately $m_p/m_e \approx 1.8\times10^{3}$, I can replace the reduced mass $\mu$ with the electron mass. In this case, the numerical coefficient in front of $1/n^2$ contains only universal constants and can be computed once and for all. It defines the so-called Rydberg unit of energy, 1 Ry, which in electron volts is approximately equal to 13.6 eV. This is one of those numbers which is actually worth remembering, just like the first few digits of the number $\pi$. The physical meaning of this number has several interpretations. First of all, it is the magnitude of the ground state energy of the hydrogen atom, but taking into account that the transition from the discrete energy levels to the continuous spectrum (ionization of the atom) amounts to raising the energy above zero, you can also interpret this value as the binding energy or ionization energy of a hydrogen atom: the work required to change the electron's energy from the ground state to zero. Also, this number fixes the scale of atomic energies in general. Transition to atoms heavier than hydrogen, which are characterized by larger atomic numbers, makes the ground state energy more negative, increasing the binding energy of the atom, which of course totally makes sense (nuclei with larger charge attract electrons more strongly).

If you apply Eq. 8.16 to excitons in semiconductors, it will yield a very different energy scale. This happens for several reasons. First, the masses of the interacting positive and negative charges forming an exciton are comparable in magnitude, so one does need to compute the reduced mass. Second, these masses are often an order of magnitude smaller than the mass of the free electron, which results in a significant decrease of the binding energy. This decrease is further enhanced by the relatively large dielectric constant $\varepsilon_r$ of semiconductors. All these factors taken together result in a much higher ground state energy of excitons (remember, the energy is negative!) with a much smaller ionization or binding energy, which varies across different semiconductors and is typically of the order of $10^{-3}$ to $10^{-2}$ eV.
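The numbers quoted for hydrogen and the order-of-magnitude claims for excitons are easy to reproduce from Eqs. 8.8 and 8.16. The sketch below uses CODATA values of the constants; the exciton parameters (reduced mass of 0.06 electron masses and a relative permittivity of 13) are illustrative assumptions of roughly the size found in common semiconductors, not values taken from the text.

```python
import math

# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg
E_CHARGE = 1.602176634e-19   # C
EPS_0 = 8.8541878128e-12     # F/m


def bohr_radius(mu=M_E, eps_r=1.0):
    """Characteristic length of Eq. 8.8, in metres."""
    return 4 * math.pi * EPS_0 * eps_r * HBAR ** 2 / (E_CHARGE ** 2 * mu)


def ground_state_binding_energy_ev(mu=M_E, eps_r=1.0, z=1):
    """Magnitude of E_1 from Eq. 8.16, converted from joules to electron volts."""
    joules = (z ** 2 * E_CHARGE ** 4 * mu
              / (32 * math.pi ** 2 * (EPS_0 * eps_r) ** 2 * HBAR ** 2))
    return joules / E_CHARGE


# Hydrogen in vacuum, with the reduced mass replaced by the electron mass.
print(bohr_radius(), ground_state_binding_energy_ev())

# Hypothetical exciton: both parameter values are assumptions for illustration.
MU_EXCITON = 0.06 * M_E
EPS_R_SEMICONDUCTOR = 13.0
print(bohr_radius(MU_EXCITON, EPS_R_SEMICONDUCTOR),
      ground_state_binding_energy_ev(MU_EXCITON, EPS_R_SEMICONDUCTOR))
```

The scaling is transparent from the formulas: the binding energy is multiplied by $\mu/(m_e\varepsilon_r^2)$ and the radius by $\varepsilon_r m_e/\mu$, which is why a modest change in each parameter shifts the scales by orders of magnitude.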

Finally, I would like to point out the fact that the discrete energy levels of hydrogen-like atoms occupy the finite spectral region between the ground state and zero. Despite this fact, the number of these levels is infinite, unlike, for instance, in the case of a one-dimensional square potential well. This means that with increasing principal quantum number $n$, the separation between the adjacent levels becomes smaller and smaller, and at some point the discreteness of the energy becomes unrecognizable, even though the probability to observe the electron infinitely far from the nucleus is still zero. One can think about this phenomenon as approaching a classical limit, in which the electron's motion is finite and it is still bound to the nucleus, but quantum effects become negligibly small.

For each value of $n$, there are several combinations of $j_{\max}$ and $l$ satisfying Eq. 8.15, which means that all energy levels, except for the ground state, are degenerate, with several wave functions belonging to the same energy eigenvalue that differ from each other by the values of $l$ and $m$. The total degree of degeneracy is easy to compute taking into account that for each $n$, there are $n$ values of $l$ ($l$ obeys the obvious inequality $l < n$, so $l = 0, 1, \ldots, n-1$), and for each $l$, there are $2l+1$ possible values of $m$. The total number of wave functions corresponding to the same value of energy is, therefore, given by

$$
\sum_{l=0}^{n-1}(2l+1) = 2\frac{(n-1)n}{2} + n = n^2. \tag{8.18}
$$

As expected, this formula yields a single state for $n = 1$, for which $j_{\max} = 0$ and $l = m = 0$. For the energy level with $n = 2$, Eq. 8.18 predicts the existence of four states, which we easily recognize as one state with $l = m = 0$ (in which case $j_{\max} = 1$) and three more characterized by $l = 1, m = -1$; $l = 1, m = 0$; and $l = 1, m = 1$. For all three of the latter, the maximum power of the polynomial function in the solution is $j_{\max} = 0$. For some murky and not very important historical reasons, $l = 0$ states are called s-states, $l = 1$ are called p-states, $l = 2$ are d-states, and, finally, the letter f is reserved for $l = 3$ states. This nomenclature originates from the spectroscopists' names for the series of spectral lines associated with each of these states (sharp, principal, diffuse, and fundamental), but I am not going into this issue any further. Those who are interested are welcome to google it.
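The counting behind Eq. 8.18 can be made concrete by listing the states explicitly. The short sketch below enumerates, for a given $n$, every allowed triple $(l, m, j_{\max})$ using Eq. 8.15; the function name `states` is a label chosen for this illustration.

```python
def states(n):
    """All (l, m, j_max) triples allowed for principal quantum number n.

    l runs from 0 to n - 1, m from -l to l, and j_max = n - l - 1 by Eq. 8.15.
    """
    return [(l, m, n - l - 1) for l in range(n) for m in range(-l, l + 1)]


for n in range(1, 5):
    print(n, len(states(n)), states(n))
```

For each $n$ the length of the list equals $n^2$, and the $n = 2$ entry reproduces the four states described in the text.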

Replacing all dimensionless variables with the physical radial coordinate $r = \rho na_B/Z$, I find that the radial wave function $R_{n,l} = u_{n,l}/r$ with fixed values of $n$ and $l$ is a product of $\left(Zr/na_B\right)^{l}$, the exponential function $\exp\left(-Zr/na_B\right)$, and a polynomial $v_{n,l}\left(Zr/na_B\right)$ of the order $n-l-1$. The polynomials which emerge in this problem are well known in mathematical physics as associated Laguerre polynomials, defined by two indexes as $L^{p}_{q-p}(x)$. The definition of these polynomials can be found in many textbooks as well as online, but for your convenience, I will provide it here as well. The definition is somewhat cumbersome and involves an additional polynomial called simply a Laguerre polynomial (no "associated" here):

$$
L_q(x) = e^{x}\left(\frac{d}{dx}\right)^{q}\left(e^{-x}x^{q}\right). \tag{8.19}
$$

To define the associated Laguerre polynomial, one needs to carry out some additional differentiation of the simple Laguerre polynomial:

$$
L^{p}_{q-p}(x) = (-1)^{p}\left(\frac{d}{dx}\right)^{p}L_q(x). \tag{8.20}
$$

It is quite obvious that index $q$ in Eq. 8.19 specifies the degree of the respective polynomial (the exponential functions obviously cancel each other after differentiation is performed). At the same time, index $q-p$ in Eq. 8.20 specifies the ultimate degree of the associated polynomial (differentiating a polynomial of degree $q$ a total of $p$ times will reduce its degree exactly by this amount). So, in terms of these functions, the polynomial appearing in the hydrogen model can be written as $v_{n,l}\left(Zr/na_B\right) = L^{2l+1}_{n-l-1}\left(2Zr/na_B\right)$. The total normalized radial wave function $R_{n,l}(r)$ can be shown to be

$$
R_{n,l}(r) = \sqrt{\left(\frac{2Z}{na_B}\right)^{3}\frac{(n-l-1)!}{2n\,(n+l)!}}\,\exp\left(-\frac{Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^{l}L^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right). \tag{8.21}
$$
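The normalization in Eq. 8.21 can be checked numerically without any special libraries. The sketch below evaluates the associated Laguerre polynomial from its explicit finite sum, in the convention where $L^{\alpha}_{0}(x) = 1$ (this is assumed to be the convention in which Eq. 8.21 is written), and integrates $r^2R_{n,l}^2$ with a simple midpoint rule. The function names, the cutoff radius, and the number of integration steps are choices made for this illustration.

```python
import math


def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^alpha(x) as an explicit finite sum.

    Convention assumed here: L_0^alpha(x) = 1 and L_1^alpha(x) = 1 + alpha - x.
    """
    return sum((-1) ** i * math.comb(k + alpha, k - i) * x ** i / math.factorial(i)
               for i in range(k + 1))


def radial(n, l, r, z=1, a_b=1.0):
    """Radial function R_{n,l}(r) of Eq. 8.21."""
    x = 2 * z * r / (n * a_b)
    norm = math.sqrt((2 * z / (n * a_b)) ** 3
                     * math.factorial(n - l - 1) / (2 * n * math.factorial(n + l)))
    return norm * math.exp(-x / 2) * x ** l * assoc_laguerre(n - l - 1, 2 * l + 1, x)


def norm_integral(n, l, steps=20000):
    """Midpoint-rule estimate of the integral of r^2 R^2 over 0 < r < 40 n (Z = 1, a_B = 1)."""
    h = 40.0 * n / steps
    return h * sum(((i + 0.5) * h) ** 2 * radial(n, l, (i + 0.5) * h) ** 2
                   for i in range(steps))


for n, l in [(1, 0), (2, 0), (2, 1), (3, 1)]:
    print(n, l, norm_integral(n, l))
```

Each integral should come out very close to 1. If you switch to a definition of the polynomials with a different overall factor, the same check tells you immediately how the normalization coefficient has to change.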

I surely hope you are impressed by the complexity of this expression and can appreciate the amount of labor that went into finding the normalization coefficient here, which you are given as a gift. I also have to warn you that different authors may use different definitions of the Laguerre polynomials, which affects the appearance of Eq. 8.21. More specifically, one might include an extra factor $1/(n+l)!$ either in the definition of the polynomial or in the normalization factor. Equation 8.21 is written according to the former convention, while if the latter is accepted, the term $(n+l)!$ must be replaced with $\left[(n+l)!\right]^{3}$. You might find both versions of the hydrogen wave function on the Internet or in the literature, and my choice was completely determined by the convention adopted by the popular computational platform MATHEMATICA ©, which I am using a lot to perform computations needed for this book. The total hydrogen wave function, which in the abstract notation can be presented as $|n,l,m\rangle$, is obtained by multiplying the radial function and the spherical harmonics $Y_{l,m}(\theta,\varphi)$:

$$
\psi_{n,l,m}(r,\theta,\varphi) = R_{n,l}(r)\,Y_{l,m}(\theta,\varphi). \tag{8.22}
$$

Different factors in Eqs. 8.22 and 8.21 are responsible for different physical effects; however, before giving them any useful interpretation, I have to remind you that the respective probability distribution density is given by

$$
P(r,\theta) = \left|\psi_{n,l,m}(r,\theta,\varphi)\right|^{2}r^{2}\sin\theta = \left(\frac{2Z}{na_B}\right)^{3}\frac{(n-l-1)!}{2n\,(n+l)!}\exp\left(-\frac{2Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^{2l}r^{2}\left[L^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right)\right]^{2}\left[P^{m}_{l}(\cos\theta)\right]^{2}\sin\theta \tag{8.23}
$$


where I replaced spherical harmonics by the product of associated Legendre functions $P^{m}_{l}(\cos\theta)$ and $\exp(im\varphi)$ and took into account that the latter disappears after multiplication by the respective complex-conjugated expression. The additional factors $r^2$ and $\sin\theta$ are due to the spherical volume element, which has the form $dV = r^{2}\sin\theta\,d\theta\,d\varphi\,dr$.

The exponential factor describes how fast the wave function decreases at infinity. The respective characteristic scale

$$
r_{at} = \frac{na_B}{Z} \tag{8.24}
$$

can be interpreted (quite loosely though) as the size of the atom, because the probability to find the electron at a distance $r \gg r_{at}$ becomes exponentially small. In the case of the atom in the ground state ($n = 1$), it is easy to show (i.e., if you remember that a maximum of a function is given by a zero of its first derivative) that the distance $r = r_{at}$ corresponds to the maximum of the probability $P(r) = \int_{0}^{\pi}P(r,\theta)\,d\theta$. In the case of a hydrogen atom ($Z = 1$), $r_{at} = a_B$, and if this atom is in vacuum ($\varepsilon_r = 1$), the Bohr radius is determined by fundamental constants only. Replacing the reduced mass with the mass of the electron, you can find that the size of the hydrogen atom in the ground state is $a_B \approx 0.5\times10^{-10}$ m. This number sets the atomic spatial scale just as 13.6 eV sets the typical energy scale. In the case of excitons in semiconductors, the characteristic scale becomes much larger for the same reasons why the energy scale becomes smaller: a large dielectric constant and smaller masses yield a larger $a_B$; see Eq. 8.8. As a result, the typical size of the exciton can be as large as $10^{-8}$ m, which is extremely important for semiconductor physics, as it allows significant simplification of the quantum description of excitons.
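The statement about the ground-state maximum can be spelled out in two lines. The sketch below keeps only the $r$-dependent factors of Eq. 8.23 for $n = 1$, $l = 0$, where the Laguerre polynomial is a constant; the proportionality sign absorbs all constant prefactors.

```latex
P(r) \propto r^{2}\exp\left(-\frac{2Zr}{a_B}\right),
\qquad
\frac{dP}{dr} \propto 2r\left(1 - \frac{Zr}{a_B}\right)\exp\left(-\frac{2Zr}{a_B}\right) = 0
\;\Rightarrow\;
r = \frac{a_B}{Z} = r_{at}\Big|_{n=1} .
```

The other root, $r = 0$, is a minimum of $P(r)$, so $r = a_B/Z$ is the only maximum.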

Radial distribution for higher lying states can have several maximums, so such a direct interpretation of rat becomes impossible, but it still can be thought of as a cutoff distance, starting from which the probability for the electron to wander off dramatically decreases. It is interesting that this parameter increases with n, so excited atoms not only have more energy, but they are also larger in size. Figure 8.2 presents a number of radial functions for your perusal, which illustrates most of the properties discussed here.

The factors containing the power of the radial coordinate are responsible for electrons not falling on the nucleus: the probability that $r = 0$ is strictly zero. This probability for $r < r_{at}$ decreases with increasing angular momentum number $l$, which can be interpreted as a manifestation of the "centrifugal" force keeping rotating particles away from the center of rotation. The Laguerre polynomial factor is essentially responsible for the behavior of the radial wave function at the intermediate distances between zero and $r_{at}$: the degree of the respective polynomial determines how many zeroes the radial wave function has. Finally, the Legendre function $P^{m}_{l}(\cos\theta)$ is responsible for the directionality of the probability distribution with respect to the $Z$-axis. States with zero angular momentum are described by a completely isotropic wave function, which does not depend on direction at all. For


Fig. 8.2 A few radial functions with different values of principal and orbital numbers $n$ and $l$. All graphs in the left panel correspond to $l = 0$ and $n$ increasing from 1 to 3. The number of zeroes of the functions is equal to $n-l-1$. The graphs in the right panel correspond to $n = 3, l = 1$; $n = 3, l = 2$; and $n = 4, l = 1$ (which one is which you can figure out yourselves by counting zeroes). The functions are not normalized for convenience of display

states with non-zero $l$, an important parameter is $l - |m|$, which yields the number of zeroes of the Legendre function and can also be used to determine the number of the respective maxima. The properties of the Legendre functions have already been discussed in Sect. 5.1.4, and the plots illustrating them were presented in Fig. 5.3, which you might want to consult to refresh your memory.

8.3 Virial and Feynman–Hellmann Theorems and Expectation Values of the Radial Coordinate in a Hydrogen Atom

I will finish the chapter by discussing one apparently very special and technical but at the same time practically very important problem: calculating the expectation values $\langle r^{p}\rangle$ of various powers of the radial coordinate $r^{p}$, where $p$ can be any negative or positive integer, in the stationary states of a hydrogen atom. Formally, calculation of these expectation values involves evaluation of the integrals

$$
\langle r^{p}\rangle = \int_{0}^{\infty}r^{p+2}\left[R_{nl}(r)\right]^{2}dr \tag{8.25}
$$

where $R_{nl}(r)$ has been defined in Eq. 8.21 and the extra 2 in $r^{p+2}$ comes from the term $r^2$ in the probability distribution generated by the hydrogen-like wave function, Eq. 8.23. Direct calculation of the integral in Eq. 8.25 is a hopeless task given the complexity of the radial function, but it is possible to circumvent the problem by relying on the radial equation, Eq. 8.7, itself, rather than on the explicit form of its solution, Eq. 8.21.


But first, let me derive a remarkable relation between the expectation values of kinetic and potential energies of a quantum particle known as the virial theorem. Consider the expectation value of the operator $\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}$ in an arbitrary quantum state and compute its time derivative using the Heisenberg picture of quantum mechanics (the expectation values do not really depend on which picture is used, but working with time-dependent Heisenberg operators and time-independent states is more convenient than using the Ehrenfest theorem, Eq. 4.17, for Schrödinger operators):

$$
\frac{d}{dt}\left\langle\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}\right\rangle = \left\langle\frac{d\hat{\mathbf{r}}}{dt}\cdot\hat{\mathbf{p}}\right\rangle + \left\langle\hat{\mathbf{r}}\cdot\frac{d\hat{\mathbf{p}}}{dt}\right\rangle.
$$

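Applying the Heisenberg equations for the position and momentum operators, Eqs. 4.28 and 4.29, to this expression, I obtain

$$
\frac{d}{dt}\left\langle\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}\right\rangle = \left\langle\frac{\hat{\mathbf{p}}}{m}\cdot\hat{\mathbf{p}}\right\rangle - \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle = 2\left\langle\hat{K}\right\rangle - \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle. \tag{8.26}
$$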

The left-hand side of Eq. 8.26 must vanish if the state used to compute the expectation value is an eigenvector of the Hamiltonian (a stationary state in the Schrödinger picture) because the expectation value of any operator in a stationary state is time-independent. This allows me to conclude that in the stationary states, the expectation values of kinetic and potential energies satisfy the relation

$$
2\left\langle\hat{K}\right\rangle = \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle \tag{8.27}
$$

known as the virial theorem. In the case of the Coulomb potential of the hydrogen atom Hamiltonian, this theorem yields

$$
2\left\langle\hat{K}\right\rangle = \frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle. \tag{8.28}
$$

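Since the expectation value of the Hamiltonian in its own stationary state is simply equal to the respective eigenvalue, I can write for the hydrogen-like Hamiltonian:

$$
\left\langle\hat{H}\right\rangle = \left\langle\hat{K}\right\rangle - \frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle \;\Rightarrow\; E_n = -\frac{Ze^2}{8\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle
$$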

where I replaced the expectation value of the Hamiltonian with its eigenvalue for the $n$-th stationary state and used Eq. 8.28 to eliminate the expectation value of the kinetic energy. Finally, using Eq. 8.16 for $E_n$, I have

$$
\frac{Z^2e^4\mu}{32\pi^2\varepsilon_r^2\varepsilon_0^2\hbar^2}\frac{1}{n^2} = \frac{Ze^2}{8\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle \;\Rightarrow\; \left\langle\frac{1}{r}\right\rangle = \frac{Ze^2\mu}{4\pi\varepsilon_r\varepsilon_0\hbar^2}\frac{1}{n^2} = \frac{Z}{a_Bn^2} \tag{8.29}
$$

where in the last step I used Eq. 8.8 for the Bohr radius $a_B$. The expectation values $\langle r^{p}\rangle$ for almost all other values of $p$ can be derived using the so-called Kramers' recursion relations, which I provide here without proof:

$$
\frac{p+1}{n^2}\left\langle r^{p}\right\rangle - (2p+1)\frac{a_B}{Z}\left\langle r^{p-1}\right\rangle + \frac{p\,a_B^2}{4Z^2}\left[(2l+1)^2 - p^2\right]\left\langle r^{p-2}\right\rangle = 0. \tag{8.30}
$$
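To see how Eq. 8.30 is used in practice, it helps to run the recursion upward from the two values already known: $\langle r^{0}\rangle = 1$ (normalization) and $\langle r^{-1}\rangle$ from Eq. 8.29. The sketch below does this with exact fractions, measuring $\langle r^{p}\rangle$ in units of $(a_B/Z)^{p}$; the function name and the choice of units are assumptions made for this illustration.

```python
from fractions import Fraction


def kramers_moments(n, l, p_max):
    """Return {p: <r^p>} for p = -1 .. p_max, in units of (a_B / Z)^p.

    Starts from <r^0> = 1 and <r^-1> = 1 / n^2 (Eq. 8.29) and applies Eq. 8.30
    solved for <r^p> at p = 1, 2, ..., p_max.
    """
    moments = {0: Fraction(1), -1: Fraction(1, n * n)}
    for p in range(1, p_max + 1):
        rhs = ((2 * p + 1) * moments[p - 1]
               - Fraction(p, 4) * ((2 * l + 1) ** 2 - p * p) * moments[p - 2])
        moments[p] = rhs * Fraction(n * n, p + 1)
    return moments


print(kramers_moments(1, 0, 2))   # ground state
print(kramers_moments(2, 1, 2))   # n = 2, l = 1
```

Setting $p = 1$ by hand gives the closed form $\langle r\rangle = (a_B/2Z)\left[3n^2 - l(l+1)\right]$, which is a convenient cross-check on the output.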

It is easy to see that I can indeed use Eqs. 8.29 and 8.30 to find $\langle r^{p}\rangle$ for any positive $p$, but Kramers' relations fail to yield $\langle r^{-2}\rangle$: this term could arise if you set $p = 0$, but, unfortunately, it vanishes because of the factor $p$ in front of it. Therefore, I have to find an independent way of computing $\langle r^{-2}\rangle$. Luckily, there exists a cool theorem, which Richard Feynman derived while working on his undergraduate thesis, called the Feynman–Hellmann theorem.¹ The derivation of this theorem is based on the obvious identity, which is valid for an arbitrary Hamiltonian and which I have already mentioned when deriving Eq. 8.29. To reiterate, the identity states that

$$
E_n = \langle\psi_n|\hat{H}|\psi_n\rangle
$$

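if $|\psi_n\rangle$ are the eigenvectors of $\hat{H}$. Now assume that the Hamiltonian $\hat{H}$ depends on some parameter $\lambda$. It can be, for instance, the mass of a particle, or its charge, or something else. It is obvious then that the eigenvalues and the eigenvectors also depend on the same parameter. Differentiating this identity with respect to this parameter, you get

$$
\frac{\partial E_n}{\partial\lambda} = \left\langle\frac{\partial\psi_n}{\partial\lambda}\right|\hat{H}\left|\psi_n\right\rangle + \left\langle\psi_n\right|\hat{H}\left|\frac{\partial\psi_n}{\partial\lambda}\right\rangle + \left\langle\psi_n\right|\frac{\partial\hat{H}}{\partial\lambda}\left|\psi_n\right\rangle.
$$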

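The first two terms in this expression can be transformed as

$$
\left\langle\frac{\partial\psi_n}{\partial\lambda}\right|\hat{H}\left|\psi_n\right\rangle + \left\langle\psi_n\right|\hat{H}\left|\frac{\partial\psi_n}{\partial\lambda}\right\rangle = E_n\left(\left\langle\frac{\partial\psi_n}{\partial\lambda}\bigg|\psi_n\right\rangle + \left\langle\psi_n\bigg|\frac{\partial\psi_n}{\partial\lambda}\right\rangle\right) = E_n\frac{\partial\langle\psi_n|\psi_n\rangle}{\partial\lambda} = 0
$$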

¹ Hellmann derived this theorem 4 years before Feynman but published it in an obscure Russian journal, so it remained unknown until Feynman rediscovered it.


where I used the fact that all eigenvectors are normalized to unity, so that their norm, appearing in the last step of the above derivation, is just a constant. Thus, here is the statement of the Feynman–Hellmann theorem:

$$
\frac{\partial E_n}{\partial\lambda} = \left\langle\psi_n\right|\frac{\partial\hat{H}}{\partial\lambda}\left|\psi_n\right\rangle. \tag{8.31}
$$

This is a very simple, almost trivial result, and it is quite amazing that it can be used to solve rather complicated problems, such as finding the expectation value $\langle r^{-2}\rangle$ in the hydrogen atom problem. So, let's see how this is achieved. Going back to Eq. 8.7, you can recognize that this equation can be seen as an eigenvalue equation for the Hamiltonian

$$
\hat{H}_r = -\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + \frac{\hbar^2l(l+1)}{2\mu r^2} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r} \tag{8.32}
$$

and that hydrogen energies are eigenvalues of this Hamiltonian. Therefore, I can apply the Feynman–Hellmann theorem to this Hamiltonian, choosing, for instance, the orbital quantum number $l$ as the parameter $\lambda$. Differentiation of Eq. 8.32 with respect to $l$ yields

$$
\frac{\partial\hat{H}_r}{\partial l} = \frac{\hbar^2(2l+1)}{2\mu r^2}.
$$

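In order to find the derivative $\partial E_n/\partial l$, one needs to recall that the principal quantum number is related to the orbital number as $n = l + n_r + 1$ so that

$$
\frac{\partial E_n}{\partial l} = \frac{\partial E_n}{\partial n} = \frac{Z^2e^2}{4\pi\varepsilon_r\varepsilon_0a_B}\frac{1}{n^3}.
$$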

Now, applying the Feynman–Hellmann theorem, I can write

$$
\frac{\hbar^2(2l+1)}{2\mu}\left\langle\frac{1}{r^2}\right\rangle = \frac{Z^2e^2}{4\pi\varepsilon_r\varepsilon_0a_B}\frac{1}{n^3}
$$

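where I used Eq. 8.17 for the energy. Rearranging this result and applying Eq. 8.8 for the Bohr radius, I obtain the final expression for $\langle r^{-2}\rangle$:

$$
\left\langle\frac{1}{r^2}\right\rangle = \frac{Z^2e^2\mu}{2\pi\hbar^2\varepsilon_r\varepsilon_0a_B}\frac{1}{(2l+1)n^3} = \frac{2Z^2}{a_B^2}\frac{1}{(2l+1)n^3}. \tag{8.33}
$$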

Now, boys and girls, if what you have just witnessed is not a piece of pure magic with the Feynman–Hellmann theorem working as a magic wand, I do not know what else you would call it. And if you are not able to appreciate the awesomeness of this derivation, you probably shouldn't be studying quantum mechanics or physics at all for that matter. This result is also a key to finding, with the help of Kramers' relations, Eq. 8.30, the expectation values $\langle r^{p}\rangle$ for any $p$. For instance, to find $\langle r^{-3}\rangle$, you just need to use Eq. 8.30 with $p = -1$:

$$
\frac{a_B}{Z}\left\langle r^{-2}\right\rangle - \frac{a_B^2}{4Z^2}\left[(2l+1)^2 - 1\right]\left\langle r^{-3}\right\rangle = 0 \;\Rightarrow\; \left\langle\frac{1}{r^3}\right\rangle = \frac{4Z}{a_B}\frac{1}{(2l+1)^2 - 1}\left\langle\frac{1}{r^2}\right\rangle = \left(\frac{Z}{a_B}\right)^{3}\frac{2}{l(l+1)(2l+1)n^3}. \tag{8.34}
$$

If the sheer wonder at our ability to compute $\langle r^{p}\rangle$ without using the unseemly Laguerre polynomials is not a sufficient justification for you to spend some time doing these calculations, you will have to wait till Chap. 14, where I will put this result to actual use in understanding the fine structure of the spectra of hydrogen-like atoms.
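For readers who would like some reassurance that Eqs. 8.29, 8.33, and 8.34 agree with the "hopeless" integral of Eq. 8.25, a brute-force numerical comparison is possible. The sketch below sets $Z = 1$ and $a_B = 1$, assumes the Laguerre convention in which $L^{\alpha}_{0}(x) = 1$, and uses a midpoint rule with a cutoff at $r = 40n$; all function names and numerical settings are choices made for this illustration.

```python
import math


def assoc_laguerre(k, alpha, x):
    """L_k^alpha(x) in the convention where L_0^alpha = 1 (assumed here)."""
    return sum((-1) ** i * math.comb(k + alpha, k - i) * x ** i / math.factorial(i)
               for i in range(k + 1))


def radial(n, l, r):
    """R_{n,l}(r) of Eq. 8.21 with Z = 1 and r measured in units of a_B."""
    x = 2.0 * r / n
    norm = math.sqrt((2.0 / n) ** 3 * math.factorial(n - l - 1)
                     / (2 * n * math.factorial(n + l)))
    return norm * math.exp(-x / 2) * x ** l * assoc_laguerre(n - l - 1, 2 * l + 1, x)


def moment(n, l, p, steps=40000):
    """Midpoint-rule estimate of <r^p>, Eq. 8.25, integrating out to r = 40 n."""
    h = 40.0 * n / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h          # midpoints avoid r = 0 for negative powers
        total += r ** (p + 2) * radial(n, l, r) ** 2
    return total * h


def closed_form(n, l, p):
    """Eqs. 8.29, 8.33, and 8.34 with Z = 1, a_B = 1 (p = -3 requires l >= 1)."""
    if p == -1:
        return 1.0 / n ** 2
    if p == -2:
        return 2.0 / ((2 * l + 1) * n ** 3)
    if p == -3:
        return 2.0 / (l * (l + 1) * (2 * l + 1) * n ** 3)
    raise ValueError("only p = -1, -2, -3 are tabulated here")


for n, l in [(2, 1), (3, 1), (3, 2)]:
    for p in (-1, -2, -3):
        print(n, l, p, moment(n, l, p), closed_form(n, l, p))
```

The states with $l = 0$ are deliberately left out of the $p = -3$ comparison: the factor $l(l+1)$ in the denominator of Eq. 8.34 signals that the corresponding integral does not converge.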

8.4 Problems

Problem 103 Using Eqs. 8.4 and 8.5 together with canonical commutation relations for single-particle coordinates and momentums, derive the commutator between relative position vector r and corresponding momentum pr to convince yourself that these variables, indeed, obey the canonical commutation relations.

Problem 104 Verify that Eq. 8.10 defines a quantity of the dimension of energy.

Problem 105

1. Derive Eq. 8.14 by applying the power series method to Eq. 8.12 and carrying out the procedure outlined in the text.

2. Find all radial functions with $n = 1$ and $n = 2$. Normalize them.

Problem 106 Using the definition of the associated Laguerre functions provided in the text, find explicit expressions for the radial functions corresponding to the states considered in the previous problem. Normalize them and make sure that the results are identical to those obtained previously.

Problem 107 An operator of the dipole moment is defined as $\hat{\mathbf{d}} = e\hat{\mathbf{r}}$, where $e$ is the elementary charge and $\hat{\mathbf{r}}$ is the position operator of the electron in the hydrogen atom. A dipole moment of a transition is defined as a matrix element of this operator between initial and final states of a system: $\mathbf{d}_{nlm,n'l'm'} \equiv \langle nlm|\hat{\mathbf{d}}|n'l'm'\rangle$. Evaluate this dipole moment for the transitions between the ground state of the atom and all degenerate states characterized by $n = 2$.

Problem 108 Find the expectation values $\langle r\rangle$, $\langle 1/r\rangle$, and $\langle r^2\rangle$ for a hydrogen atom in the $|2,1,m\rangle$ state.


Problem 109 Using the results of the previous problem and the full 3-D Schrödinger equation with non-separated variables, find $\langle\hat{p}^2\rangle$. Find a relation between the expectation values of the potential and kinetic energies.

Problem 110 A hydrogen atom is prepared in an initial state:

$$
\psi(\mathbf{r},0) = \frac{1}{\sqrt{2}}\left(\psi_{2,1,1}(r,\theta,\varphi) + \psi_{1,0,0}(r,\theta,\varphi)\right).
$$

Find the expectation value of the potential energy as a function of time.

Problem 111 Consider a hydrogen atom in a state described by the following wave function:

$$
\psi(\mathbf{r}) = R_{1,0}(r) + a\frac{z - \sqrt{2}x}{r}R_{2,1}(r)
$$

where

$$
R_{n,l}(r) = \left(r/na_B\right)^{l+1}\exp\left(-\frac{r}{na_B}\right)L^{2l+1}_{n-l-1}\left(2r/na_B\right).
$$

1. Rewrite this function in terms of normalized hydrogen wave functions.

2. Find the values of coefficient $a$ that would make the entire function normalized.

3. If you measure $\hat{L}^2$ and $\hat{L}_z$, what values can you get and with what probabilities?

4. If you measure energy, which values are possible and what are their probabilities?

5. Find the probability that the measurement of the particle's position will find it in the direction specified by the polar angle $\theta$ in the range $44^{\circ} < \theta < 46^{\circ}$.

6. Find the probability that the measurement of the particle's position will find the particle at a distance $0.5a_B < r < a_B$ from the nucleus.

Chapter 9 Spin 1/2

9.1 Introduction: Why Spin?

The model of a pure spin 1/2, detached from all other degrees of freedom of a particle, is one of the simplest in quantum mechanics. Yet, it defies our intuition and resists developing that pleasant sensation of being able to relate a new concept to something that we think we already know (or at least are used to thinking about). We call this feeling "intuitive understanding," and it does play an important albeit mysterious role in our ability to use new concepts. The reason for this difficulty, of course, lies in the fact that spin is a purely quantum phenomenon with no reasonable way to model it on something that we know from classical physics. While the only bulletproof remedy for this predicament known to me is practice, I will try to ease your pain somewhat by taking the time to develop the concept of spin and by providing empirical and theoretical arguments for its inevitability.

Experimentally, spin manifests itself most directly via the interaction between electrons and a magnetic field and can be defined as an inherent property of electrons responsible for this interaction. This definition is akin to the definition of electric charge as a property responsible for the electron's interaction with the electric field or of mass as a characteristic determining the electron's acceleration under the action of a force. The substantial difference, of course, is that charge and mass are immutable scalar quantities, our views of which do not change when we transition from classical to quantum theories of nature. The concept of spin, on the other hand, is purely quantum and embodies two distinct types of entities. First is a Hermitian vector operator, characterized by two distinct eigenvalues and corresponding eigenvectors, which specify the possible experimental outcomes when one attempts to measure spin. Second are the spinors, a particular type of vectors subjected to the action of the spin operator and representing various spin states; they control the probability of one or another outcome of the measurement.

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_9


To untangle the connections between spin, angular momentum, and magnetic interactions, let me begin with a simple example of a classical electron moving along a circular orbit of radius $R$ with period $T$. Taken literally, this example does not make much sense, but it does produce surprisingly reasonable results, so it can be considered as a convenient and meaningful metaphor. So, imagine an observer placed at some point on the orbit and counting the number of times the electron passes by during some time $t \gg T$. The number of "sightings" of the electron, $n$, is related to the duration of the experiment $t$ and the period $T$ as $t = nT$. The total amount of charge that passes by the observer is obviously $q = ne = et/T$, where $e$ is the elementary charge. The amount of charge passing per unit time is what we call the electric current, which can be found as $I = q/t = et/(Tt) = e/T$. This crude trick replaced a circulating electron by a stationary electric current, which, of course, only makes sense if I spread the entire charge of the electron along its orbit by some kind of averaging procedure. But as I said, I am treating this model only as a metaphor. Accepting this metaphor, I can follow up by remembering that the interaction between a steady loop of current and a uniform magnetic field is described by the loop's magnetic dipole moment $\boldsymbol{\mu}$, defined as $\boldsymbol{\mu} = IA\mathbf{n}$, where $A$ is the area of the loop and $\mathbf{n}$ is the unit vector normal to the plane of the loop with direction determined by the right-hand rule (do you remember the right-hand rule?). In the case of the orbiting electron, the loop area is $A = \pi R^2$, so I have

$$
\boldsymbol{\mu}_L = \frac{e}{T}\pi R^2\mathbf{n} = \frac{ev}{2\pi R}\pi R^2\mathbf{n} = \frac{em_evR}{2m_e}\mathbf{n} = -\frac{e}{2m_e}\mathbf{L} \tag{9.1}
$$

where I (a) expressed the period $T$ in terms of the circumference $2\pi R$ and orbital velocity $v$: $T = 2\pi R/v$, (b) multiplied the numerator and the denominator of the resulting expression by the electron’s mass $m_e$, and (c) recognized that $m_e v R\,\mathbf{n}$ is a vector that is equal in magnitude and opposite in direction to the orbital angular momentum of the electron $\mathbf{L}$. To figure out the “opposite” part of the last statement, recall that the magnetic moment is defined by the direction of the current, that is, by the motion of positive charges, while the charge of our orbiting electron is negative, and, therefore, it rotates in the direction opposite to the current. Equation 9.1 establishes the connection between the magnetic dipole moment of the electron and its orbital angular momentum.

The interaction between a classical magnetic dipole and a uniform magnetic field $\mathbf{B}$ can be described by a potential energy:

$$U_B = -\boldsymbol{\mu}_L\cdot\mathbf{B} = \frac{e}{2m_e}\mathbf{L}\cdot\mathbf{B}. \tag{9.2}$$

According to this expression, the potential energy has a minimum when the magnetic dipole is oriented along the magnetic field and a maximum when they are oriented antiparallel to each other. For both these orientations, the torque on the dipole $\boldsymbol{\tau} = \boldsymbol{\mu}_L\times\mathbf{B}$ is zero, so these are two equilibrium positions, but while the former is the stable equilibrium, the latter is unstable. Equation 9.2 also establishes the connection between the potential energy $U_B$ and the electron’s angular momentum $\mathbf{L}$, which is quite useful for transitioning to the quantum description. Quantization in this case consists merely in promoting the components of the angular momentum to the status of operators. This newly born operator $\hat U_B$ can now be added to the Hamiltonian $\hat H_0$ describing the electron in the absence of the magnetic field to yield

$$\hat H = \hat H_0 + \frac{e}{2m_e}\mathbf{B}\cdot\hat{\mathbf{L}}. \tag{9.3}$$

$\hat H_0$, for instance, can describe an electron moving in some central potential $V(r)$ (the Coulomb potential would be a good example), and I will assume that its eigenvalues $E_{n,l}$ and eigenvectors $|n,l,m\rangle$ are known. The choice of notation here reflects the fact that the eigenvectors of the Hamiltonian with a central potential must also be eigenvectors of the angular momentum operators $\hat L^2$ and $\hat L_z$ and that its eigenvalues do not depend on the magnetic quantum number $m$.

It is quite easy to verify that if I choose the polar ($Z$) axis of the coordinate system in the direction of the uniform magnetic field $\mathbf{B}$, the eigenvectors $|n,l,m\rangle$ of $\hat H_0$ also remain eigenvectors of the total Hamiltonian given by Eq. 9.3. The corresponding eigenvalues are found as

$$\left(\hat H_0 + \frac{eB}{2m_e}\hat L_z\right)|n,l,m\rangle = E_{n,l}|n,l,m\rangle + \frac{eB}{2m_e}\hbar m\,|n,l,m\rangle = \left(E_{n,l} + \hbar\frac{eB}{2m_e}m\right)|n,l,m\rangle. \tag{9.4}$$

The combination of fundamental constants $e\hbar/2m_e$ has the dimension of a magnetic dipole moment and is prominent enough to warrant giving it its own name. The Bohr magneton $\mu_B$ is defined as

$$\mu_B = \frac{e\hbar}{2m_e} \tag{9.5}$$

so that the expression for the energy eigenvalues can be written down as

$$E^{Z}_{n,l,m} = E_{n,l} + m\mu_B B. \tag{9.6}$$

The term $m\mu_B B$ can be interpreted as the energy of interaction between the uniform magnetic field and a quantized magnetic moment whose values are multiples of $\mu_B$. In this sense, the Bohr magneton can be thought of as a quantum of magnetic dipole moment. The most remarkable prediction of this simple computation is the $m$-dependence of the resulting energy levels, which is responsible for lifting the original $(2l+1)$-fold degeneracy of the energy eigenvectors. Since the magnetic field is the primary reason for this, it seems quite natural to give the quantum number $m$ the name of “magnetic” number.
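To get a feeling for the size of the splitting in Eq. 9.6, the following minimal sketch evaluates the Bohr magneton and the shifts $m\mu_B B$ numerically. The function names, the choice $l = 1$, and the field of 1 T are illustrative assumptions and are not taken from the text; the constants are the standard SI values.

```python
# Standard SI values of the constants; all names below are illustrative.
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg


def bohr_magneton():
    """Bohr magneton mu_B = e * hbar / (2 * m_e), Eq. 9.5, in J/T."""
    return E_CHARGE * HBAR / (2.0 * M_E)


def zeeman_shifts(l, b_field):
    """Shifts m * mu_B * B (in joules) for m = -l, ..., l, as in Eq. 9.6."""
    mu_b = bohr_magneton()
    return {m: m * mu_b * b_field for m in range(-l, l + 1)}


mu_b = bohr_magneton()
shifts = zeeman_shifts(l=1, b_field=1.0)  # B = 1 T is an arbitrary choice

for m, shift in shifts.items():
    # Dividing by the elementary charge converts joules to electron-volts.
    print(f"m = {m:+d}: shift = {shift / E_CHARGE:+.3e} eV")
```

For a field of one tesla the shifts are of the order of $10^{-5}$ eV, tiny compared with the electron-volt scale of atomic levels, which is why the split lines appear closely spaced.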


Experimentally, this degeneracy lifting is observed via the Zeeman effect: the splitting of the absorption or emission spectral lines in the presence of a magnetic field. I will discuss the relation between the absorption/emission of light and atomic energy levels in more detail in Part III of the book, but at this point, it is sufficient to recall Bohr’s old postulates, one of which relates the frequencies of the absorbed or emitted light to atomic energy levels:

$$\omega_{\alpha,\beta} = \frac{E_\alpha - E_\beta}{\hbar},$$

where $\alpha, \beta$ are composite indexes replacing groups of $n, l, m$ for the sake of simplicity of notation. So, if you, say, observe light emission due to the transition from the first excited level of the hydrogen atom with $n = 2$ to the ground state, in the absence of a magnetic field, you would see just one emission line formed by transitions from the states $|2,0,0\rangle$, $|2,1,-1\rangle$, $|2,1,0\rangle$, and $|2,1,1\rangle$, all of which have the same energy, $E_2 = -\tilde E/4$, where $\tilde E$ was defined in Eq. 8.10. When the magnetic field is turned on, two of these states, $|2,1,-1\rangle$ and $|2,1,1\rangle$, acquire magnetic field-related corrections:

$$E_{2,-1} = -\tilde E/4 - \mu_B B, \qquad E_{2,1} = -\tilde E/4 + \mu_B B,$$

making their energies different from each other and from $E_{2,0}$. As a result, instead of a single emission line with frequency $\omega = (E_2 - E_1)/\hbar = 3\tilde E/4\hbar$, an experimentalist would observe three lines at frequencies:

$$\omega_{-1} = \frac{3\tilde E}{4\hbar} - \frac{\mu_B}{\hbar}B, \qquad \omega_0 = \frac{3\tilde E}{4\hbar}, \qquad \omega_1 = \frac{3\tilde E}{4\hbar} + \frac{\mu_B}{\hbar}B.$$

You should not think, though, that by deriving Eq. 9.4, I completely solved the Zeeman effect. The actual problem is much more complicated and involves the addition of orbital and spin magnetic moments, as well as multi-electron effects, relativistic corrections, the magnetic moment of nuclei, etc. What I did was just an illustration designed to make a particular point: the lifting of the $(2l+1)$-fold degeneracy of atomic levels by the magnetic field gives rise to an odd number of closely positioned spectral lines. While for some atoms an odd number of lines is indeed observed, a large number of other observations manifest splitting into an even number of lines. This phenomenon, called the anomalous Zeeman effect, cannot be explained by interaction with the orbital magnetic moment, because an even number of lines implies half-integer values of $l$. To explain this effect, we have to admit that in addition to the “normal” orbital magnetic moment, electrons also have another magnetic moment, which cannot be constructed from the coordinate and momentum operators and has to be, therefore, an intrinsic property of the electron not related to other regular (spatial–temporal) observables. The lowest number of observed split lines was equal to two. Equating $2l + 1$ to 2, you find that this splitting corresponds to $l = 1/2$. If you also recall that $l$ is the maximum value of the magnetic number $m$, you might realize that $m$ in this case can have only two values, $m = \pm 1/2$.

A meticulous and mischievous reader might ask, of course, if it is absolutely necessary to derive a magnetic dipole moment from an angular momentum. Can a magnetic moment exist just by itself with no angular momentum attached to it? The answer to the first question is yes and to the second one is, obviously, no. To justify these answers, however, is not so easy, and the path toward realizing that electrons do possess an intrinsic angular momentum, which can be in one of two possible states, was a long one. Such physicists as Wolfgang Pauli (Austria–Switzerland–USA) and Arnold Sommerfeld (Germany) recognized very early that the purely orbital states of electrons proposed in Bohr’s model of atoms could not explain all experimental data, which consistently indicated that the actual number of states is double what Bohr’s model predicted. Pauli was writing about the “two-valuedness” of electrons in early 1925, as he needed it to explain the structure of atoms and formulate his famous exclusion principle. Later in 1925 two graduate students of Paul Ehrenfest from Leiden, the Netherlands, Goudsmit and Uhlenbeck, published a paper in which they proposed that the required additional states come from an intrinsic angular momentum of electrons due to their “spinning” on their own axis. They postulated that this new angular momentum of the electron, $\mathbf{S}$, is related to its magnetic moment $\boldsymbol{\mu}_s$ in a way similar to the relation between orbital momentum and orbital magnetic moment, but in order to fit experimental data, they had to multiply the Bohr magneton by 2:

$$\boldsymbol{\mu}_s = -2\,\frac{e}{2m_e}\mathbf{S} = -\frac{2\mu_B}{\hbar}\mathbf{S}. \tag{9.7}$$

The idea appeared so ridiculous to many serious physicists (such as Lorentz) that the students almost withdrew their paper, but luckily for them (and for physics), it was too late, and the paper was published. Eventually, it was recognized that it was indeed wrong to think that a point-like particle such as an electron can actually spin about its axis (estimates of the required spinning speed would put it well above the speed of light), so this classical mechanistic interpretation had to go. The idea of an intrinsic angular momentum, which “just is” as one of the attributes of electrons, survived, however, committing the names of Goudsmit and Uhlenbeck to the history of physics. Ironically, this was the highest achievement of their lives: they both made decent careers in physics, moving to the USA and securing respectable professorial positions, but they never did anything as significant as their almost withdrawn student paper on spin.

There are other purely theoretical arguments for understanding spin as a different kind of angular momentum, but this discussion is for a different time and a different book. At this point let me just mention that if we want to be able to add orbital angular momentum and spin angular momentum, which is absolutely necessary to explain a host of effects in atomic spectra, we must require that they both are described by objects of the same mathematical nature. This means that if the orbital momentum is described in quantum mechanics by three operator components $\hat L_x$, $\hat L_y$, and $\hat L_z$ of the angular momentum vector with commutation relations given by Eqs. 3.53–3.55, spin angular momentum must also be described by three operator components $\hat S_x$, $\hat S_y$, and $\hat S_z$ with the same commutation relations. Our calculations in Sect. 3.3.4 demonstrated that these commutation relations ensure that one of the operator components (usually it is chosen to be the z-component) and the operator of the square of the angular momentum $\hat L^2$ (or $\hat S^2$) can have a common system of eigenvectors characterized by a set of two eigenvalues, $\hbar m_l$ for the z-component and $\hbar^2 l(l+1)$ for the square operator, where $m_l$ can take values $m_l = -l, -l+1, \cdots, l-1, l$ and $l$ can be either integer or half-integer. The results of Sect. 5.1.4 indicated that orbital angular momentum can only be characterized by integer eigenvalues, but, as you can see, half-integer values are needed to deal with the spin angular momentum. It is amusing to think that nature tends to find use for everything that appears in abstract mathematical theories! To distinguish between spin and orbital moments, I will replace the notation $l$ for the maximum eigenvalue of operator $\hat L_z$ (in units of $\hbar$) with the notation $s$ for the maximum eigenvalue of $\hat S_z$. The lowest nonzero value that $s$ can take is $1/2$, which means that there are only two possible eigenvalues of this operator, $-\hbar/2$ and $\hbar/2$. The eigenvalue of the operator $\hat S^2$ is $\hbar^2 s(s+1) = 3\hbar^2/4$, but it is the value of $s$ that we have in mind when we are talking about the electron having spin 1/2. Thus, Pauli’s two-valuedness of the electron comes here in the form of two eigenvectors and two eigenvalues of the z-component of the spin operator.

The idea that the spin is an intrinsic and immutable property of electrons means that the $1/2$ value of the quantum number $s$ (or the $3\hbar^2/4$ eigenvalue of operator $\hat S^2$) is as unchangeable as an electron’s mass or charge, but at the same time, the electron can be in various distinct spin states described by the eigenvectors of $\hat S_z$ or their arbitrary superposition.

9.2 Spin 1/2 Operators and Spinors

While spin $1/2$ operators are characterized by the same commutation relations as the operator of the orbital angular momentum, they act on vectors that live in a two-dimensional space of spin states, or spinors. There is no reason to panic at the sound of the unfamiliar word. The term spinor is used to describe a specific class of abstract vectors, which have all the same properties as any other vectors belonging to a Hilbert space, only much simpler because the dimensionality of the spinor space is just 2. One can introduce a ket spinor $|\chi\rangle$, its adjoint bra spinor $\langle\chi|$, and the inner product of spinors $\langle\chi|\chi'\rangle$, which has the same property as any other inner product: $\langle\chi|\chi'\rangle = \left(\langle\chi'|\chi\rangle\right)^*$. A basis in this space can be formed by the two eigenvectors of operator $\hat S_z$, for which physicists use several notations, different in appearance but otherwise equivalent. Two of the popular ways to designate these eigenvectors are $|1/2\rangle$ for the state belonging to the eigenvalue $\hbar/2$ and $|-1/2\rangle$ for its counterpart accompanying the eigenvalue $-\hbar/2$. Alternatively, states with the positive eigenvalue are often called spin-up states with the corresponding notation $|\uparrow\rangle$, while states with the negative eigenvalue are called spin-down states and are denoted $|\downarrow\rangle$. The main difference between spinors and vectors representing other states of quantum systems is that the spinors do not have a coordinate representation. They exist separately from the vector spaces formed by the eigenvectors of position or momentum operators or any other observables related to them. Spinors describe intrinsic properties of electrons, while vectors from other spaces represent their extrinsic spatial–temporal states.

This basis of the eigenvectors of operator $\hat S_z$ can be used to construct a particular representation of spinors and spin operators, as I demonstrated about 100 pages ago in Sect. 5.2.3. Generic spinors in this basis are represented by $2\times 1$ column vectors:

$$|\chi\rangle = \begin{pmatrix} a \\ b \end{pmatrix} = a\begin{pmatrix} 1 \\ 0 \end{pmatrix} + b\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{9.8}$$

where $|\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ represents the spin-up or $m = 1/2$ eigenvector, while $|\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ represents the spin-down or $m = -1/2$ eigenvector. The representation of the respective bra vector is given by

$$\langle\chi| = \begin{pmatrix} a^* & b^* \end{pmatrix} = a^*\begin{pmatrix} 1 & 0 \end{pmatrix} + b^*\begin{pmatrix} 0 & 1 \end{pmatrix}, \tag{9.9}$$

and the norm is

$$\langle\chi|\chi\rangle = a^*a + b^*b. \tag{9.10}$$

Normalized spinors obviously obey the condition

$$|a|^2 + |b|^2 = 1. \tag{9.11}$$

Spin operators $\hat S_x$, $\hat S_y$, and $\hat S_z$ defined with respect to a particular Cartesian coordinate system are represented in the basis of the eigenvectors of $\hat S_z$ by two-by-two matrices derived in Sect. 5.2.3:

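$$\hat S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{9.12}$$

$$\hat S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{9.13}$$

$$\hat S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{9.14}$$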


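Equations 9.12–9.14 are just a recapitulation of Eqs. 5.110 and 5.111 from Sect. 5.2.3, which I placed here for your convenience. Spin operators are often expressed in terms of the so-called Pauli matrices $\hat\sigma_x$, $\hat\sigma_y$, and $\hat\sigma_z$ defined as

$$\hat\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{9.15}$$

$$\hat\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{9.16}$$

$$\hat\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{9.17}$$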

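These matrices have a number of important properties such as

$$\hat\sigma_x^2 = \hat\sigma_y^2 = \hat\sigma_z^2 = \hat I, \tag{9.18}$$

which, given that they are Hermitian, means that they are simultaneously Hermitian and unitary, and

$$\hat\sigma_x\hat\sigma_y + \hat\sigma_y\hat\sigma_x = 0, \qquad \hat\sigma_x\hat\sigma_z + \hat\sigma_z\hat\sigma_x = 0, \qquad \hat\sigma_z\hat\sigma_y + \hat\sigma_y\hat\sigma_z = 0, \tag{9.19}$$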

which is often expressed as an anticommutativity property. Pauli matrices are used quite often in quantum mechanics, so it makes sense to acquaint yourselves with their properties. For instance, one can prove that the property expressed by Eq. 9.18 is valid for any matrix of the form $\hat\sigma_n = \hat{\boldsymbol{\sigma}}\cdot\mathbf{n}$, where $\mathbf{n}$ is an arbitrary unit vector and $\hat{\boldsymbol{\sigma}}$ is a vector with components given by the Pauli matrices. Using the representation of the unit vector in spherical coordinates

$$n_x = \sin\theta\cos\varphi, \qquad n_y = \sin\theta\sin\varphi, \qquad n_z = \cos\theta, \tag{9.20}$$

where $\theta$ and $\varphi$ are the polar and azimuthal angles defining the direction of $\mathbf{n}$ with respect to a particular system of Cartesian coordinate axes (see Fig. 9.1), you can derive for the matrix $\hat\sigma_n = \sin\theta\cos\varphi\,\hat\sigma_x + \sin\theta\sin\varphi\,\hat\sigma_y + \cos\theta\,\hat\sigma_z$:

$$\hat\sigma_n = \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}.$$

Fig. 9.1 Unit vector in the Cartesian coordinate system (axes $X$, $Y$, $Z$; unit vector $\mathbf{n}$ with polar angle $\theta$ and azimuthal angle $\varphi$)

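Squaring it will get you

$$\hat\sigma_n^2 = \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}\begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix} = \begin{pmatrix} \cos^2\theta + \sin^2\theta & \cos\theta\sin\theta\, e^{-i\varphi} - \cos\theta\sin\theta\, e^{-i\varphi} \\ \cos\theta\sin\theta\, e^{i\varphi} - \cos\theta\sin\theta\, e^{i\varphi} & \cos^2\theta + \sin^2\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$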

This property makes the evaluation of various functions with Pauli matrices as arguments relatively easy. One of the popular examples is the exponential function $\exp\left(i\hat{\boldsymbol{\sigma}}\cdot\mathbf{n}\right)$, which you will enjoy computing when you get to the problem section of this chapter.
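The algebraic properties of the Pauli matrices quoted above are easy to confirm numerically. The snippet below is a minimal sketch using NumPy; the helper names and the particular angles are arbitrary illustrative choices, not part of the text.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)


def sigma_n(theta, phi):
    """Pauli matrix along the unit vector n(theta, phi) of Eq. 9.20."""
    nx = np.sin(theta) * np.cos(phi)
    ny = np.sin(theta) * np.sin(phi)
    nz = np.cos(theta)
    return nx * SIGMA_X + ny * SIGMA_Y + nz * SIGMA_Z


def anticommutator(a, b):
    """Return a b + b a."""
    return a @ b + b @ a


theta, phi = 0.7, 1.9  # arbitrary direction, chosen only for illustration
s_n = sigma_n(theta, phi)

# Eq. 9.18, including the generalization to an arbitrary direction n.
squares_ok = all(
    np.allclose(s @ s, IDENTITY) for s in (SIGMA_X, SIGMA_Y, SIGMA_Z, s_n)
)

# Eq. 9.19: distinct Pauli matrices anticommute.
pairs = [(SIGMA_X, SIGMA_Y), (SIGMA_X, SIGMA_Z), (SIGMA_Z, SIGMA_Y)]
anticommute_ok = all(np.allclose(anticommutator(a, b), 0) for a, b in pairs)

print(squares_ok, anticommute_ok)
```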

To help you become more comfortable with spin operators, I will now consider a few examples.

Example 21 (Measurement of the y-Component of the Spin) Assume that a single immobile electron is placed in the state described by the spin-up eigenvector of operator $\hat S_z$. Using a magnetic field directed along the $Y$-axis of the coordinate system, you are probing possible values of the y-component of the spin. What are these possible values and what are their probabilities?

Solution

As with any observable, possible results of its measurement are given by the eigenvalues of the respective operator. In this case this operator is $\hat S_y$ and you need to determine its eigenvalues. The answer is, of course, obvious ($+\hbar/2$ and $-\hbar/2$), but let’s play the game and compute it. Besides, along the way you will determine the eigenvectors, which you need to answer the probability question. So, the eigenvector equation is

$$\frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \frac{\hbar}{2}\lambda\begin{pmatrix} a \\ b \end{pmatrix},$$

which produces a set of two linear equations

$$-ib = \lambda a, \qquad ia = \lambda b. \tag{9.21}$$

The condition for the existence of nontrivial solutions, given by the vanishing of the determinant

$$\begin{vmatrix} \lambda & i \\ -i & \lambda \end{vmatrix},$$

becomes $\lambda^2 - 1 = 0$, yielding $\lambda_{1,2} = \pm 1$. Thus, recalling the factor $\hbar/2$ that I prudently pulled out, you can conclude that the eigenvalues are, indeed, as predicted, $\pm\hbar/2$. The first eigenvector is found by substituting $\lambda = 1$ into Eq. 9.21. This gives $a = -ib$, so that the respective eigenvector can be written as

$$|\hbar/2_y\rangle = b\begin{pmatrix} -i \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} -i \\ 1 \end{pmatrix} \tag{9.22}$$

where at the last step I normalized it, requiring that $2|b|^2 = 1$. Repeating this procedure with $\lambda = -1$, I find

$$|-\hbar/2_y\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix}. \tag{9.23}$$

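Taking into account that the initial state was $|\hbar/2_z\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, I find that the probabilities of the corresponding eigenvalues are

$$p_{\pm\hbar/2} = \left|\langle\pm\hbar/2_y|\hbar/2_z\rangle\right|^2 = \frac{1}{2}\left|\begin{pmatrix} \pm i & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right|^2 = \frac{1}{2}.$$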

Not a huge surprise, really.
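The same result can be cross-checked numerically. The following sketch, which assumes units with $\hbar = 1$ and uses NumPy’s Hermitian eigensolver, diagonalizes the matrix of $\hat S_y$ and projects the spin-up state onto its eigenvectors; the variable names are illustrative.

```python
import numpy as np

# Matrix of S_y in the basis of S_z eigenvectors, Eq. 9.13, with hbar = 1.
S_Y = 0.5 * np.array([[0, -1j], [1j, 0]])

# Initial state: spin-up eigenvector of S_z.
spin_up_z = np.array([1, 0], dtype=complex)

# eigh returns eigenvalues in ascending order; the columns of the second
# array are the corresponding normalized eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(S_Y)

# Probability of each outcome: |<eigenvector|state>|^2.
probabilities = np.abs(eigenvectors.conj().T @ spin_up_z) ** 2

for value, probability in zip(eigenvalues, probabilities):
    print(f"S_y = {value:+.1f} hbar with probability {probability:.2f}")
```

The eigenvectors returned by the solver may differ from Eqs. 9.22 and 9.23 by an overall phase factor, which does not affect the probabilities.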

Example 22 (Measurement of the Arbitrary Directed Planar Spin) What if we want to measure a component of the spin along a direction not necessarily aligned with one of the coordinate axes? Let me consider an example in which the measured component of the spin is in the $Y$–$Z$ plane at an angle $\theta$ with the $Z$-axis and find possible outcomes and their probabilities assuming the same initial state as before.

Solution

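I can define the specified direction by a unit vector with y-component $\sin\theta$ and z-component $\cos\theta$. Introducing unit vectors $\mathbf{e}_y$ and $\mathbf{e}_z$ along the respective axes, this vector can be conveniently presented as $\mathbf{n} = \mathbf{e}_y\sin\theta + \mathbf{e}_z\cos\theta$. The component of the spin in the direction of $\mathbf{n}$ is given by the dot product $\hat S_n = \hat{\mathbf{S}}\cdot\mathbf{n} = \hat S_y\sin\theta + \hat S_z\cos\theta$. Using the matrix representation of the spin operators in the basis of the eigenvectors of $\hat S_z$, Eqs. 9.12–9.14, I find for $\hat S_n$

$$\hat S_n = \frac{\hbar}{2}\sin\theta\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + \frac{\hbar}{2}\cos\theta\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{pmatrix}.$$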

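The respective eigenvector equation becomes

$$\begin{pmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \lambda\begin{pmatrix} a \\ b \end{pmatrix},$$

and the equation for the eigenvalues takes the form

$$\begin{vmatrix} \cos\theta - \lambda & -i\sin\theta \\ i\sin\theta & -\cos\theta - \lambda \end{vmatrix} = -(\cos\theta - \lambda)(\cos\theta + \lambda) - \sin^2\theta = \lambda^2 - 1 = 0.$$

I am not going to pretend that I am surprised that the eigenvalues are again $\pm\hbar/2$: what else can they be?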

Equations for the eigenvectors can be written as

1. $\lambda = 1$

$$a\cos\theta - ib\sin\theta = a$$
$$-ib\sin\theta = a(1 - \cos\theta)$$
$$-2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} = 2a\sin^2\frac{\theta}{2}$$
$$-ib\cos\frac{\theta}{2} = a\sin\frac{\theta}{2}.$$

There are, of course, multiple choices of the coefficients in this equation, but I want to make the final form of the eigenvector as symmetric as possible, so I will choose $a = A\cos\frac{\theta}{2}$ and $b = iA\sin\frac{\theta}{2}$, which obviously satisfy the equation with an arbitrary $A$. The latter can be found from the normalization condition $|a|^2 + |b|^2 = 1$, which gives $|A| = 1$, so I can take $A = 1$. Now, I can write the first eigenvector as

$$|\hbar/2_n\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{pmatrix} \tag{9.24}$$

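2. $\lambda = -1$

$$a\cos\theta - ib\sin\theta = -a$$
$$ib\sin\theta = a(1 + \cos\theta)$$
$$2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} = 2a\cos^2\frac{\theta}{2}$$
$$ib\sin\frac{\theta}{2} = a\cos\frac{\theta}{2}.$$

Using the same trick as previously, I find this eigenvector to be

$$|-\hbar/2_n\rangle = \begin{pmatrix} \sin\frac{\theta}{2} \\ -i\cos\frac{\theta}{2} \end{pmatrix}. \tag{9.25}$$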

The direction described by $\theta = \pi/2$ corresponds to the unit vector $\mathbf{n}$ pointing in the direction of the $Y$-axis, reducing this example to the previous one. Naturally, you would expect the eigenvectors found here to reduce to the respective eigenvectors from the previous example. However, by substituting $\theta = \pi/2$ into Eqs. 9.24 and 9.25, you find that the resulting vectors do not coincide with Eqs. 9.22 and 9.23. Did I do something wrong here? Not really, because it is easy to notice that the difference between the two results is a mere factor of $\pm i$, and we know that multiplication of an eigenvector by $i$ or by any other complex number of the form $\exp(i\varphi)$, where $\varphi$ is an arbitrary real number, does not change a quantum state and has no observable consequences. Finally, the probabilities that the measurements of the spin will produce one of the found eigenvalues are

$$p_{\hbar/2} = \left|\begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right|^2 = \cos^2\frac{\theta}{2}$$

$$p_{-\hbar/2} = \left|\begin{pmatrix} \sin\frac{\theta}{2} & i\cos\frac{\theta}{2} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right|^2 = \sin^2\frac{\theta}{2}.$$

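I can also use this result to find the expectation value of the operator $\hat S_n$ in the state $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. The probabilistic definition of the mean, $\sum_i x_i p_i$, where $x_i$ is the value of the variable and $p_i$ is its probability, yields

$$\langle\hat S_n\rangle = (\hbar/2)\cos^2\frac{\theta}{2} - (\hbar/2)\sin^2\frac{\theta}{2} = (\hbar/2)\cos\theta,$$

which is exactly the value you should have expected from a classical vector oriented along the $Z$-axis when computing its component in the direction of $\mathbf{n}$. The same result is obtained by computing the expectation value using the operator definition:

$$\langle\hat S_n\rangle = \langle\uparrow_z|\hat S_n|\uparrow_z\rangle = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} \cos\theta \\ i\sin\theta \end{pmatrix} = \frac{\hbar}{2}\cos\theta.$$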
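The two routes to the expectation value in this example can be compared numerically. The sketch below assumes $\hbar = 1$ and an arbitrary angle $\theta$; the function names are hypothetical and serve only this illustration.

```python
import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def spin_component_yz(theta):
    """Matrix of S_n for n in the Y-Z plane at angle theta to Z (hbar = 1)."""
    return 0.5 * (np.sin(theta) * SIGMA_Y + np.cos(theta) * SIGMA_Z)


def outcome_probabilities(operator, state):
    """Eigenvalues (ascending) and probabilities |<eigenvector|state>|^2."""
    eigenvalues, eigenvectors = np.linalg.eigh(operator)
    probs = np.abs(eigenvectors.conj().T @ state) ** 2
    return eigenvalues, probs


theta = 1.1  # arbitrary angle in radians, chosen only for illustration
s_n = spin_component_yz(theta)
spin_up_z = np.array([1, 0], dtype=complex)

eigenvalues, probs = outcome_probabilities(s_n, spin_up_z)

# Mean from the probabilistic definition, sum of x_i * p_i.
mean_from_probs = float(np.sum(eigenvalues * probs))

# Mean from the operator definition, <up_z| S_n |up_z>.
mean_from_operator = float(np.real(spin_up_z.conj() @ s_n @ spin_up_z))

print(mean_from_probs, mean_from_operator, 0.5 * np.cos(theta))
```

All three printed numbers should agree to machine precision, mirroring the equality of the probabilistic and operator definitions of the mean.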

Example 23 (Measuring of the Z-Component in an Arbitrary Spinor State) You can also ask a question: what if the spin was prepared in a state presented by one of the eigenvectors of $\hat S_n$, say, $|\hbar/2_n\rangle$, and we were measuring the z-component of the spin? What would be the probabilities of obtaining $\hbar/2$ or $-\hbar/2$ and the expectation value of $\hat S_z$ in this situation?

Solution

The corresponding probabilities are given by the following expressions:

$$\left|\begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{pmatrix}\right|^2 = \cos^2\frac{\theta}{2}$$

$$\left|\begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{pmatrix}\right|^2 = \sin^2\frac{\theta}{2}$$

yielding exactly the same results. Obviously the expectation value will also be the same.

These examples were designed to prepare you to answer an important but rather confusing question. The concept of spin is supposed to represent a vector quantity existing in our regular physical three-dimensional space. At the same time, the quantum objects used to describe spin, namely operators and spinors, have little relation to this space. While spin operators do have three components, they are not regular vectors, and the question about the “direction” of a vector operator does not make much sense. Spinors, representing spin states, are objects existing in an abstract two-dimensional space. Thus, the question is how these objects are connected with the physical space in which all our measurement apparatuses live. One might attempt to deflect this question by saying that after taking the expectation values of the spin operators for a given spin state, we will end up with a regular vector, which will provide us with the information about the spin and its direction. I can counter this by saying that this information is very limited. Indeed, I can also compute the uncertainty of each spin component, which will also give me a regular vector. The problem is that in the most generic situation, the vector obtained from the expectation values and the vector obtained from the uncertainties do not have to have the same direction, making it difficult to come up with a reasonable interpretation of these results. One way to avoid this ambiguity is to focus on eigenvectors, in which case expectation values provide the complete description of the situation. You only need to figure out the connection between the spatial direction, spin operators, and corresponding eigenvectors.

One way to answer this question is to do what we just did in the previous example: introduce a component of the spin operator in the direction of interest, find its eigenvectors, and analyze their connection to this direction. But I want to add a bit more intrigue to the issue and will use a different approach. Let me ask you this: what is the best way to write down a generic spinor? Equation 9.8, which does it by introducing two complex parameters, $a$ and $b$, is too general and does not contain all the information available about even the most generic spin states. Indeed, two complex numbers contain four independent real parameters, which can be brought out explicitly by writing $a$ and $b$ in the exponential form: $a = |a|\exp(i\varphi_a)$ and $b = |b|\exp(i\varphi_b)$. I can do better than that and reduce the number of parameters to just two without making the spinor any less generic.

First, I am going to use the freedom of choice of the overall phase of the spinor. To this end I will multiply both $a$ and $b$ by $\exp\left[-i(\varphi_b + \varphi_a)/2\right]$, bringing the spinor to the following form:

$$|\chi\rangle = \begin{pmatrix} |a|\exp(-i\varphi/2) \\ |b|\exp(i\varphi/2) \end{pmatrix},$$

where $\varphi = \varphi_b - \varphi_a$, and there are only three parameters left to worry about. Obviously, this is not the only way to eliminate one of the phases, but this one presents the spinor in a rather symmetric form, and just like all physicists, I have a soft spot for symmetry. Besides, frankly speaking, I know where I want to go and am just taking you along for the ride. The normalization imposes an additional condition on these parameters, telling me that I can use it to eliminate another one of them, reducing the total number to just two. After a few seconds of staring at Eq. 9.11, it may dawn on you that this equation looks similar to the fundamental trigonometric identity $\cos^2 x + \sin^2 x = 1$ and that you can automatically satisfy the normalization condition by choosing $|a| = \cos(\theta/2)$ and $|b| = \sin(\theta/2)$, expressing both $|a|$ and $|b|$ in terms of a single parameter $\theta/2$. If you are asking why $\theta/2$ and not just $\theta$, you will have the answer in a few minutes, just keep reading. Now, as promised, I have the expression for the generic normalized spinor:

$$|\chi_1\rangle = \begin{pmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{pmatrix} \tag{9.26}$$

with only two parameters, $\theta$ and $\varphi$. The choice I made for the amplitudes is not unique, and I can generate another spinor by using $\sin(\theta/2)$ in place of $\cos(\theta/2)$ in the upper component and $-\cos(\theta/2)$ in place of $\sin(\theta/2)$ in the lower one:

$$|\chi_2\rangle = \begin{pmatrix} \sin(\theta/2)\exp(-i\varphi/2) \\ -\cos(\theta/2)\exp(i\varphi/2) \end{pmatrix}. \tag{9.27}$$

It is easy to verify by computing $\langle\chi_1|\chi_2\rangle$ that these spinors are orthogonal (of course, I designed them with this particular goal in mind), and by generating the matrices

$$|\chi_1\rangle\langle\chi_1| = \begin{pmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ \cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \sin^2(\theta/2) \end{pmatrix}$$

and

$$|\chi_2\rangle\langle\chi_2| = \begin{pmatrix} \sin^2(\theta/2) & -\cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ -\cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \cos^2(\theta/2) \end{pmatrix},$$

you can also check that

$$|\chi_1\rangle\langle\chi_1| + |\chi_2\rangle\langle\chi_2| = \hat I,$$

indicating that these two spinors form a complete set. (When trying to reproduce these calculations, do not forget complex conjugation when converting kets into the respective bra vectors.)

Thus, with little effort, I have constructed a complete set of two generic mutually orthogonal spinors characterized by parameters that can be interpreted as angles, and this must mean something. The found representation of spinors establishes a one-to-one relationship between the two-dimensional space of spin states and points on the surface of a regular three-dimensional sphere of unit radius (see Fig. 9.2). The points at the north and south poles of the sphere, characterized by $\theta = 0$ and $\theta = \pi$, describe the eigenvectors of the operator $\hat S_z$, $|\uparrow\rangle$ and $|\downarrow\rangle$, respectively (angle $\varphi$ is not defined for these points, but it is not a problem because the respective factors $\exp(\mp i\varphi/2)$ become in these cases simply insignificant phase factors). It is also easy to notice that the antipodal points lying on the opposite ends of an arbitrarily oriented diameter of the sphere correspond to two mutually orthogonal spin states. Indeed, the spherical coordinates of the antipodal points are related to each other as $\theta_2 = \pi - \theta_1$, $\varphi_2 = \varphi_1 + \pi$. Substituting these expressions into Eq. 9.26, you will immediately obtain the spinor presented in Eq. 9.27, up to an insignificant overall phase factor. While performing this operation, you can appreciate the wisdom of using half-angles $\theta/2$ and $\varphi/2$ in these expressions.

In order to further figure out the physical meaning of the mapping between spinors and directions in regular 3-D space, consider the same operator, $\hat S_n = \hat{\mathbf{S}}\cdot\mathbf{n}$, which I discussed in the preceding example, but with a unit vector $\mathbf{n}$ defining a generic direction characterized by the same angles $\theta, \varphi$ as in Fig. 9.2. This is the same vector which I introduced in connection with Pauli matrices, Eq. 9.20, so that the operator $\hat S_n$ becomes

$$\hat S_n = \frac{\hbar}{2}\hat\sigma_n = \frac{\hbar}{2}\begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}.$$

Fig. 9.2 The Bloch sphere: each point on the surface characterized by spherical coordinates $\theta, \varphi$ corresponds to a particular spin state (axes $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$; unit radius; angles $\theta$ and $\varphi$; spinor $\chi$)

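Now, let me apply this operator to the spinor presented in Eq. 9.26:

$$\frac{\hbar}{2}\begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}\begin{pmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{pmatrix}$$

$$= \frac{\hbar}{2}\begin{pmatrix} \cos\theta\cos(\theta/2)\exp(-i\varphi/2) + \sin\theta\sin(\theta/2)\exp(-i\varphi/2) \\ \sin\theta\cos(\theta/2)\exp(i\varphi/2) - \cos\theta\sin(\theta/2)\exp(i\varphi/2) \end{pmatrix}$$

$$= \frac{\hbar}{2}\begin{pmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left[\cos\theta + 2\sin^2(\theta/2)\right] \\ \sin(\theta/2)\exp(i\varphi/2)\left[2\cos^2(\theta/2) - \cos\theta\right] \end{pmatrix}$$

$$= \frac{\hbar}{2}\begin{pmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left[2\cos^2(\theta/2) - 1 + 2\sin^2(\theta/2)\right] \\ \sin(\theta/2)\exp(i\varphi/2)\left[2\cos^2(\theta/2) - 2\cos^2(\theta/2) + 1\right] \end{pmatrix}$$

$$= \frac{\hbar}{2}\begin{pmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{pmatrix}.$$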

Isn’t that nice? A generic spinor with arbitrarily introduced parameters $\theta$ and $\varphi$ turned out to be an eigenvector of an operator representing the component of the spin in the direction defined by these parameters. It probably would not come as a particularly great surprise now that the second spinor I conjured up is also an eigenvector of the same operator but corresponding to the second eigenvalue, namely, $-\hbar/2$. (Check it out as an exercise. And by the way, did you notice that in the course of this computation, I used a couple of trigonometric identities such as $\cos x = 2\cos^2(x/2) - 1$ and $\sin x = 2\sin(x/2)\cos(x/2)$?) This exercise allows us to give more substance to an already established connection between spinors and directions in physical space: each spinor parametrized as in Eq. 9.26 or 9.27 is an eigenvector of a component of the spin in the direction specified by parameters $\theta$ and $\varphi$ interpreted as spherical coordinates of the corresponding unit vector lying on the surface of the Bloch sphere. The measurement of the spin in this direction will yield definite results corresponding to the respective eigenvalue, so it can be interpreted as the direction of the spin for this particular spin state. It also makes sense that antipodal points on the Bloch sphere represent eigenvectors belonging to opposite eigenvalues of $\hat S_n$. Finally, by now, I hope you have the answer to the question of why I used half-angles in the definition of the spinors.
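The statements about the Bloch-sphere spinors can be verified numerically as well. The sketch below works with $\hat\sigma_n = 2\hat S_n/\hbar$, so the eigenvalues are $\pm 1$; the function names `chi_1`, `chi_2`, and `sigma_n`, as well as the particular angles, are illustrative assumptions.

```python
import numpy as np


def chi_1(theta, phi):
    """Spinor of Eq. 9.26."""
    return np.array([np.cos(theta / 2) * np.exp(-1j * phi / 2),
                     np.sin(theta / 2) * np.exp(1j * phi / 2)])


def chi_2(theta, phi):
    """Spinor of Eq. 9.27."""
    return np.array([np.sin(theta / 2) * np.exp(-1j * phi / 2),
                     -np.cos(theta / 2) * np.exp(1j * phi / 2)])


def sigma_n(theta, phi):
    """Matrix of sigma_n = 2 S_n / hbar for the direction (theta, phi)."""
    return np.array([[np.cos(theta), np.sin(theta) * np.exp(-1j * phi)],
                     [np.sin(theta) * np.exp(1j * phi), -np.cos(theta)]])


theta, phi = 0.9, 2.3  # arbitrary point on the Bloch sphere
up, down, op = chi_1(theta, phi), chi_2(theta, phi), sigma_n(theta, phi)

is_plus_eigenvector = np.allclose(op @ up, up)        # eigenvalue +1
is_minus_eigenvector = np.allclose(op @ down, -down)  # eigenvalue -1

# np.vdot conjugates its first argument, so this is <chi_1|chi_2>.
overlap = np.vdot(up, down)

# Completeness: |chi_1><chi_1| + |chi_2><chi_2| should be the identity.
completeness = np.outer(up, up.conj()) + np.outer(down, down.conj())

# Antipodal point: chi_1 there equals chi_2 here up to a phase factor.
antipode = chi_1(np.pi - theta, phi + np.pi)
same_state_as_down = np.isclose(abs(np.vdot(down, antipode)), 1.0)

print(is_plus_eigenvector, is_minus_eigenvector, same_state_as_down)
```

Comparing states through the modulus of their overlap, rather than component by component, is a deliberate choice: it ignores the overall phase factor, which carries no physical information.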

9.3 Dynamic of Spin in a Uniform Magnetic Field

A bound (for instance, by attraction to a nucleus) electron in a uniform magnetic field, the system used earlier to introduce the Zeeman effect, is also the simplest somewhat realistic physical model allowing one to study the quantum dynamics of a pure spin. Assuming that the interaction between the spin and the magnetic field does not affect the orbital state of the electron, one can ignore the energy associated with the latter and omit the atomic part of the Hamiltonian (remember, energy only matters when it changes, and if it does not, we can always make it equal to zero). The Hamiltonian of this system is obtained by dropping the $\hat H_0$ term from Eq. 9.3 and replacing the orbital angular momentum $\hat{\mathbf{L}}$ with $2\hat{\mathbf{S}}$, where the factor 2 takes into account the empirically established modification of the connection between spin angular momentum and magnetic moment, Eq. 9.7. The resulting Hamiltonian takes the form

$$\hat H = \frac{2\mu_B}{\hbar}\hat{\mathbf{S}}\cdot\mathbf{B}. \tag{9.28}$$

Note that the magnetic field $\mathbf{B}$ is not an operator because it describes a classical magnetic field created by a source whose physics is outside of our consideration. Since the field is uniform, it makes sense to use its direction as one of the axes of the coordinate system, which I have to specify in order to be able to carry out subsequent calculations. It is customary to choose the $Z$-axis as the one codirected with the magnetic field, in which case Hamiltonian 9.28 simplifies significantly:

$$\hat H = \frac{2\mu_B B}{\hbar}\hat S_z. \tag{9.29}$$

In this section I will discuss the dynamics of spin described by this Hamiltonian using both Schrödinger and Heisenberg pictures of quantum mechanics.

9.3.1 Schrödinger Picture

In the Schrödinger picture, we always begin by establishing the eigenvalues and eigenvectors of the Hamiltonian. It is obvious that the eigenvectors of the Hamiltonian given by Eq. 9.29 coincide with those of operator $\hat{S}_z$, which I will denote here as $|\uparrow\rangle$ with eigenvalue $\hbar/2$ (spin-up) and $|\downarrow\rangle$ with eigenvalue $-\hbar/2$ (spin-down). The respective eigenvalues of the Hamiltonian are quite obvious and are

$$E_\uparrow = \mu_B B, \qquad E_\downarrow = -\mu_B B. \tag{9.30}$$

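A solution of the time-dependent Schrödinger equation for an arbitrary time-dependent spinor $|\chi(t)\rangle$

$$-i\hbar\frac{\partial|\chi(t)\rangle}{\partial t} = \frac{2\mu_B B}{\hbar}\,\hat{S}_z|\chi(t)\rangle$$

can be presented as a linear combination of the stationary states of the Hamiltonian

$$|\chi(t)\rangle = a\exp\left(i\frac{\mu_B B}{\hbar}t\right)|\uparrow\rangle + b\exp\left(-i\frac{\mu_B B}{\hbar}t\right)|\downarrow\rangle, \tag{9.31}$$

where coefficients $a$ and $b$ are determined by the initial state of the spin

$$|\chi(0)\rangle = a|\uparrow\rangle + b|\downarrow\rangle. \tag{9.32}$$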

Equation 9.31 essentially solves the problem of the dynamics of a single spin in a uniform magnetic field. It does little, however, to develop our intuition about the physical phenomena that this solution describes. In a typical experimental situation, one is rarely dealing with a single spin. Most frequently, an experimentalist would measure a signal from an ensemble of many spins, and if we can neglect any kind of interaction between them, as well as assume that all spins are in the same initial state,¹ the experimental results can be described by finding the expectation values of the spin operators. So, let me compute these expectation values for the state described by Eq. 9.31.

To this end I will use the representation of a generic spinor in the form of Eq. 9.26 and rewrite coefficients $a$ and $b$ as

$$a = \cos(\theta/2)\exp(-i\varphi/2), \qquad b = \sin(\theta/2)\exp(i\varphi/2). \tag{9.33}$$

Substituting these expressions for $a$ and $b$ into Eq. 9.31 and using the regular representation of basis spinors $|\uparrow\rangle$ and $|\downarrow\rangle$, you can find

$$|\chi(t)\rangle = \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(i\dfrac{\mu_B B}{\hbar}t\right) \\[2mm] \sin(\theta/2)\exp(i\varphi/2)\exp\left(-i\dfrac{\mu_B B}{\hbar}t\right) \end{bmatrix}. \tag{9.34}$$

¹ The assumption about the same initial state is the most difficult to realize experimentally and can be justified only at zero temperature.

It is easiest to compute the expectation value of $\hat{S}_z$:

$$\langle\hat{S}_z\rangle = \frac{\hbar}{2}\left(|a|^2 - |b|^2\right) = \frac{\hbar}{2}\cos\theta. \tag{9.35}$$

I derived this expression taking advantage of the fact that $|\uparrow\rangle$ and $|\downarrow\rangle$ in Eq. 9.31 are eigenvectors of $\hat{S}_z$, and, therefore, coefficients in front of them (their absolute values squared, of course) determine the probabilities of the respective eigenvalues. To find the expectation values of two other components, I will have to do a little bit more work computing

$$\langle\hat{S}_{x,y}\rangle = \langle\chi(t)|\hat{S}_{x,y}|\chi(t)\rangle.$$

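I begin with the x-component and first compute the right half of this expression, $\hat{S}_x|\chi(t)\rangle$:

$$\hat{S}_x|\chi(t)\rangle = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(i\dfrac{\mu_B B}{\hbar}t\right) \\[2mm] \sin(\theta/2)\exp(i\varphi/2)\exp\left(-i\dfrac{\mu_B B}{\hbar}t\right) \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} \sin(\theta/2)\exp(i\varphi/2)\exp\left(-i\dfrac{\mu_B B}{\hbar}t\right) \\[2mm] \cos(\theta/2)\exp(-i\varphi/2)\exp\left(i\dfrac{\mu_B B}{\hbar}t\right) \end{bmatrix}.$$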

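By the way, have you noticed how operator $\hat{S}_x$ flips the components of the spinor? Anyway, to complete this computation, I find the inner product of this ket with the bra version of the spinor in Eq. 9.31:

$$\langle\hat{S}_x\rangle = \frac{\hbar}{2}\left[\cos(\theta/2)\sin(\theta/2)\exp(i\varphi)\exp\left(-i\frac{2\mu_B B}{\hbar}t\right) + \cos(\theta/2)\sin(\theta/2)\exp(-i\varphi)\exp\left(i\frac{2\mu_B B}{\hbar}t\right)\right] = \frac{\hbar}{2}\sin\theta\cos\left(\frac{2\mu_B B}{\hbar}t - \varphi\right).$$

Similar calculations with the y-component operator yield

$$\langle\hat{S}_y\rangle = \frac{\hbar}{2}\sin\theta\sin\left(\frac{2\mu_B B}{\hbar}t - \varphi\right).$$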

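Let’s collect all these results together to get a better picture:

$$\begin{aligned}
\langle\hat{S}_z\rangle &= \frac{\hbar}{2}\cos\theta \\
\langle\hat{S}_x\rangle &= \frac{\hbar}{2}\sin\theta\cos\left(\frac{2\mu_B B}{\hbar}t - \varphi\right) \\
\langle\hat{S}_y\rangle &= \frac{\hbar}{2}\sin\theta\sin\left(\frac{2\mu_B B}{\hbar}t - \varphi\right).
\end{aligned} \tag{9.36}$$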

Here is what we have: a vector of length $\hbar/2$ remains at all times at angle $\theta$ with respect to the magnetic field, but its projection on the X–Y plane of the coordinate system (which is perpendicular to the magnetic field) rotates with frequency $\omega_L = 2\mu_B B/\hbar = eB/m_e$, where I substituted Eq. 9.5 for the Bohr magneton. A remarkable fact about this result is the disappearance of Planck’s constant from the final expression for frequency, which signals that this phenomenon must exist in classical physics as well, and it, indeed, does. Equation 9.36 describes a very well-known effect, Larmor precession, which is observed whenever a magnetic moment (of any nature) interacts with a uniform magnetic field. However, the frequency of the precession might be different for different magnetic moments because of its dependence on the so-called gyromagnetic ratio, defined as the coefficient of proportionality between the magnetic dipole moment and the angular momentum. For the orbital angular momentum, this ratio is $-e/(2m_e)$ as given by Eq. 9.1, while for the spin, it is two times larger, resulting in a precession frequency twice as large.
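The precession picture can be illustrated with a short numerical sketch. It builds the spinor exactly as written in Eq. 9.34, evaluates the three expectation values with the standard Pauli-matrix expressions, and lets one confirm that the longitudinal component stays at $(\hbar/2)\cos\theta$, that the x-component follows Eq. 9.36, and that the transverse projection keeps a constant length $(\hbar/2)\sin\theta$. The sense of rotation depends on sign conventions and is deliberately not asserted here. The function names, the units $\hbar = 1$, and the numerical constants (CODATA values of $e$ and $m_e$) are choices made for this illustration.

```python
import cmath
import math

HBAR = 1.0  # illustrative units with hbar = 1


def state(theta, phi, omega_l, t):
    """Components of the spinor of Eq. 9.34.

    omega_l = 2*mu_B*B/hbar, so that mu_B*B/hbar = omega_l/2.
    """
    up = math.cos(theta / 2) * cmath.exp(-1j * phi / 2) * cmath.exp(1j * omega_l * t / 2)
    down = math.sin(theta / 2) * cmath.exp(1j * phi / 2) * cmath.exp(-1j * omega_l * t / 2)
    return up, down


def expectation_values(up, down):
    """<S_x>, <S_y>, <S_z> for a normalized spinor (up, down)."""
    overlap = up.conjugate() * down
    sx = HBAR * overlap.real
    sy = HBAR * overlap.imag
    sz = 0.5 * HBAR * (abs(up) ** 2 - abs(down) ** 2)
    return sx, sy, sz


def larmor_frequency(b_tesla):
    """omega_L = e*B/m_e in rad/s for a field given in tesla."""
    e = 1.602176634e-19      # elementary charge, C
    m_e = 9.1093837015e-31   # electron mass, kg
    return e * b_tesla / m_e
```

For a field of 1 T, `larmor_frequency(1.0)` evaluates to roughly 1.76e11 rad/s, which gives a feeling for the time scale of the precession.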

9.3.2 Heisenberg Picture

To describe spin precession in the Heisenberg picture, I have to solve Heisenberg equations 4.24 for spin operators. To simplify notations I will omit subindex H, which I used to distinguish Schrödinger from Heisenberg operators. However, it is important to note that the angular momentum commutation relations, Eqs. 3.53–3.55, remain the same for both pictures, provided that we take Heisenberg operators at the same time. If you do not see how to verify this statement, imagine sandwiching both sides of the commutation relation between operators $\exp(i\hat{H}t/\hbar)$ and $\exp(-i\hat{H}t/\hbar)$ and also inserting the products of these operators (which is equal to unity, by the way) between the products of any two operators in the commutator. Thus, using the necessary commutation relation, I obtain the following equations:

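$$\frac{d\hat{S}_z}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_z, \hat{S}_z\right] = 0 \tag{9.37}$$

$$\frac{d\hat{S}_x}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_x, \hat{S}_z\right] = -\omega_L\hat{S}_y \tag{9.38}$$

$$\frac{d\hat{S}_y}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_y, \hat{S}_z\right] = \omega_L\hat{S}_x, \tag{9.39}$$

where I introduced the Larmor frequency $\omega_L$, defined in the previous section, into the Hamiltonian, so that $\hat{H} = \omega_L\hat{S}_z$. Differentiating Eqs. 9.38 and 9.39 with respect to time, I can separate them into two independent differential equations of the second order:

$$\frac{d^2\hat{S}_x}{dt^2} = -\omega_L^2\hat{S}_x, \qquad \frac{d^2\hat{S}_y}{dt^2} = -\omega_L^2\hat{S}_y$$

with obvious solutions

$$\hat{S}_x(t) = \hat{A}_x\sin\omega_L t + \hat{B}_x\cos\omega_L t, \qquad \hat{S}_y(t) = \hat{A}_y\sin\omega_L t + \hat{B}_y\cos\omega_L t.$$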

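Unknown constant operators $\hat{A}_{x,y}$ and $\hat{B}_{x,y}$ are determined by the initial conditions for the spin operators and their derivatives:

$$\hat{B}_x = \hat{S}_x(0); \qquad \hat{A}_x = \frac{1}{\omega_L}\left.\frac{d\hat{S}_x}{dt}\right|_{t=0} = -\hat{S}_y(0)$$

$$\hat{B}_y = \hat{S}_y(0); \qquad \hat{A}_y = \frac{1}{\omega_L}\left.\frac{d\hat{S}_y}{dt}\right|_{t=0} = \hat{S}_x(0)$$

where $\hat{S}_{x,y}(0)$ coincide with the Schrödinger spin operators. Thus, I have

$$\hat{S}_x(t) = -\hat{S}_y(0)\sin\omega_L t + \hat{S}_x(0)\cos\omega_L t \tag{9.40}$$

$$\hat{S}_y(t) = \hat{S}_x(0)\sin\omega_L t + \hat{S}_y(0)\cos\omega_L t. \tag{9.41}$$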

All that is left to do is to compute the expectation values of the Schrödinger spin operators in the initial state given by Eqs. 9.32 and 9.33. However, I do not have to repeat these calculations as we can just read them off Eq. 9.36 at $t = 0$. This yields

$$\langle\hat{S}_x(t)\rangle = \frac{\hbar}{2}\sin\theta\cos\varphi\cos\omega_L t + \frac{\hbar}{2}\sin\theta\sin\varphi\sin\omega_L t = \frac{\hbar}{2}\sin\theta\cos(\omega_L t - \varphi)$$

$$\langle\hat{S}_y(t)\rangle = \frac{\hbar}{2}\sin\theta\cos\varphi\sin\omega_L t - \frac{\hbar}{2}\sin\theta\sin\varphi\cos\omega_L t = \frac{\hbar}{2}\sin\theta\sin(\omega_L t - \varphi)$$

in complete agreement with the results obtained from the Schrödinger picture.
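The operator solutions of Eqs. 9.40 and 9.41 can also be checked directly with 2 × 2 matrices. The sketch below is only an illustration under stated assumptions: it takes the Hamiltonian in the form $\hat{H} = \omega_L\hat{S}_z$, builds the Heisenberg operator as $\exp(i\hat{H}t/\hbar)\,\hat{S}\,\exp(-i\hat{H}t/\hbar)$ (the sandwiching described above), and compares the result with the right-hand sides of Eqs. 9.40 and 9.41. Units with $\hbar = 1$ and all function names are chosen for this example.

```python
import cmath
import math

# Spin-1/2 matrices in units of hbar = 1 (illustrative choice)
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def heisenberg(op, omega_l, t):
    """exp(iHt) op exp(-iHt) for H = omega_l * S_z, which is diagonal."""
    u = [[cmath.exp(-0.5j * omega_l * t), 0],
         [0, cmath.exp(0.5j * omega_l * t)]]
    return matmul(dagger(u), matmul(op, u))


def combine(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two 2x2 matrices."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))
```

With `w` and `t` arbitrary real numbers, `close(heisenberg(SX, w, t), combine(math.cos(w * t), SX, -math.sin(w * t), SY))` reproduces Eq. 9.40, and the analogous comparison with `combine(math.sin(w * t), SX, math.cos(w * t), SY)` reproduces Eq. 9.41.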


9.4 Spin of a Two-Electron System

9.4.1 Space of Two-Particle States

I will complete the discussion of the spin by considering a system of two electrons. The goal of this exercise is to figure out if it makes sense to talk about a total spin of this system understood as some kind of sum of two individual spins $\hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_2$. In classical physics that would have been a trivial question: of course, we can define the total angular momentum of several particles; just add them up remembering that they are vectors. You can even derive a total angular momentum conservation law valid in the absence of external torques, just like you can derive a total linear momentum conservation law if the system of particles is not exposed to external forces. In quantum mechanics, when individual spins are presented by operators acting in different spaces containing spin states of each particle, the answer to this question is more complex. It is still affirmative: yes, it is possible to define the total spin of a system of two (or more) particles by introducing a new operator, which can be formally defined as

$$\hat{\mathbf{S}}^{(tp)} = \hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}, \tag{9.42}$$

where the upper index is an abbreviation of “two-particle.” However, so far Eq. 9.42 is a purely formal expression, in which even the meaning of the sign “+” is not clear. What I need to do now is to figure out the properties of $\hat{\mathbf{S}}^{(tp)}$ and their relation to the properties of $\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$, which is not a trivial task.

Operators are defined by their action on vectors, and, since vectors live in a certain vector space, the first step in defining an operator is to understand the space where the vectors on which the operator acts live. Operators $\hat{\mathbf{S}}_1$ and $\hat{\mathbf{S}}_2$ operate on vectors that live in different and unrelated spaces: one acts on spin states of one particle and the other on the states of a completely different particle. I can, however, combine these spaces to form a new extended space, which would include spin states of both particles. To define such a space, all I need is to define a basis in it, and then any other vector can be presented as a linear combination of the vectors of the basis. The space containing spin states of each individual particle is defined by two basis vectors for each particle. These states are eigenvectors of operators $\hat{S}^{(1)}_z$ and $\hat{S}^{(2)}_z$ (obviously defined in the same coordinate system) and can be depicted symbolically in a few equivalent ways discussed in previous sections. Here I will use spin-up and spin-down notations indicated by the vertical arrows $|\uparrow_{1,2}\rangle$ or $|\downarrow_{1,2}\rangle$, where subindexes 1 and 2 simply indicate a particle whose states these kets represent. In a system of two particles, there exist four different combinations of their spin states: both spins up, both spins down, the first spin up and the second spin down, and vice versa. You can create a notation for these states either by putting two state signifiers inside a single ket, like this: $|\uparrow_1, \downarrow_2\rangle$, or by sticking together two kets like this: $|\uparrow_1\rangle|\downarrow_2\rangle$. The difference between the two notations is superficial, and either one can be used freely, while the second notation with two separate kets is slightly more convenient when one needs to write down matrix elements corresponding to operators acting on different particles. Thus, I will present the four basis vectors in a new four-dimensional space containing states of a two-spin system as

$$|1\rangle \equiv |\uparrow_1\rangle|\uparrow_2\rangle, \quad |2\rangle \equiv |\uparrow_1\rangle|\downarrow_2\rangle, \quad |3\rangle \equiv |\downarrow_1\rangle|\uparrow_2\rangle, \quad |4\rangle \equiv |\downarrow_1\rangle|\downarrow_2\rangle. \tag{9.43}$$

Conversion from kets to bras occurs following all the standard rules of Hermitian conjugation applied to the states of both particles.

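A larger space formed from two smaller spaces in the described manner is called in mathematics a tensor product of spaces. It has all the standard algebraic properties of a linear vector space discussed in Sect. 2.2, and I only need to add the distributive properties involving vectors belonging to different components of the tensor product:

$$\begin{aligned}
\left(|e_1\rangle + |e_2\rangle\right)|v_1\rangle &\equiv |e_1\rangle|v_1\rangle + |e_2\rangle|v_1\rangle \\
|e_1\rangle\left(|v_1\rangle + |v_2\rangle\right) &\equiv |e_1\rangle|v_1\rangle + |e_1\rangle|v_2\rangle.
\end{aligned} \tag{9.44}$$

The inner product in the tensor space is defined as

$$\left(\langle e_1|\langle v_1|\right)\left(|e_2\rangle|v_2\rangle\right) = \langle e_1|e_2\rangle\langle v_1|v_2\rangle, \tag{9.45}$$

and it is quite obvious that this definition preserves the main property of the inner product, namely, that $\langle\beta|\alpha\rangle = \langle\alpha|\beta\rangle^*$. In the case of the two-spin system, you can, for instance, find using the notation from Eq. 9.43:

$$\langle 1|1\rangle = \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 1$$

where it is presumed that vectors $|\uparrow_{1,2}\rangle$ are normalized. You can also find that the inner products involving different vectors from the basis vanish, such as

$$\langle 1|2\rangle = \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\downarrow_2\rangle = 0, \qquad \langle 1|3\rangle = \langle\uparrow_1|\downarrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 0.$$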

In reality you have already encountered the tensor product of spaces earlier in this book, even though I never used the name. One example was the construction of states of a three-dimensional harmonic oscillator from the states of the one-dimensional oscillators.

To illustrate calculations of the inner product between vectors belonging to such a tensor product space, consider the following example.

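Example 24 (Working with Vectors from a Tensor Product Space) Compute the norms of the following vectors as well as the inner product $\langle\beta|\alpha\rangle$:

$$\begin{aligned}
|\alpha\rangle &= \left(3i|\uparrow_1\rangle + 4|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - i|\downarrow_2\rangle\right) \\
|\beta\rangle &= \left(2|\uparrow_1\rangle - i|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - 3|\downarrow_2\rangle\right).
\end{aligned}$$

Solution

Since all the vectors in adjacent parenthetic expressions are kets and belong to different spaces, it is clear that I am dealing here with the tensor product of two spaces. Distributive properties, expressed by Eq. 9.44, allow me to convert these expressions into

$$\begin{aligned}
|\alpha\rangle &= 6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle \\
|\beta\rangle &= 4|\uparrow_1\rangle|\uparrow_2\rangle + 3i|\downarrow_1\rangle|\downarrow_2\rangle - 6|\uparrow_1\rangle|\downarrow_2\rangle - 2i|\downarrow_1\rangle|\uparrow_2\rangle.
\end{aligned}$$

Note that the order in which vectors belonging to different spaces are stacked together is completely irrelevant. Using the normalized and orthogonal basis introduced in Eq. 9.43, I can rewrite these expressions as

$$\begin{aligned}
|\alpha\rangle &= 6i|1\rangle - 4i|4\rangle + 3|2\rangle + 8|3\rangle \\
|\beta\rangle &= 4|1\rangle + 3i|4\rangle - 6|2\rangle - 2i|3\rangle.
\end{aligned}$$

Now I can compute the norms and the inner product following the standard procedure, which yields

$$\|\alpha\| = \sqrt{36 + 16 + 9 + 64} = \sqrt{125}, \qquad \|\beta\| = \sqrt{16 + 9 + 36 + 4} = \sqrt{65}$$

$$\langle\beta|\alpha\rangle = 4\cdot 6i - 4i\cdot(-3i) - 3\cdot 6 + 8\cdot(2i) = -30 + 40i.$$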
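The arithmetic of this example is easy to reproduce with a few lines of code. The sketch below is an illustration only: it stores each single-particle spinor as a list of two coefficients (spin-up first), forms the tensor product coefficient by coefficient so that the resulting list follows the ordering $|1\rangle, |2\rangle, |3\rangle, |4\rangle$ of Eq. 9.43, and evaluates the inner product with complex conjugation of the bra. The helper names `kron` and `inner` are made up for this example.

```python
def kron(u, v):
    """Coefficients of |u>|v> in the basis |1>, |2>, |3>, |4> of Eq. 9.43."""
    return [a * b for a in u for b in v]


def inner(u, v):
    """Inner product <u|v>; the coefficients of the bra are conjugated."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


# Single-particle factors of Example 24, written as [up, down] coefficients
alpha = kron([3j, 4], [2, -1j])
beta = kron([2, -1j], [2, -3])

norm_sq_alpha = inner(alpha, alpha).real   # squared norm of |alpha>
norm_sq_beta = inner(beta, beta).real      # squared norm of |beta>
overlap = inner(beta, alpha)               # <beta|alpha>
```

The same overlap can be obtained from Eq. 9.45 as a product of two single-particle inner products, $\langle\beta_1|\alpha_1\rangle\langle\beta_2|\alpha_2\rangle = (10i)(4 + 3i)$, which provides an independent check.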

Finally, I need to introduce the rule describing how spin operators act on the vectors in the tensor product space. The rule is actually very simple: the operators affect only the states of their own particles. To illustrate this rule, consider the following example.

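Example 25 (Operator Action in Tensor Spaces) For the state $|\alpha\rangle$ from the previous example, compute

(a) $\left(\hat{S}^{(1)}_z + \hat{S}^{(2)}_z\right)|\alpha\rangle$

(b) $\left(\hat{S}^{(1)}_+ + \hat{S}^{(2)}_+\right)|\alpha\rangle$

Solution

(a)

$$\begin{aligned}
&\left(\hat{S}^{(1)}_z + \hat{S}^{(2)}_z\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) \\
&\quad = 6i\hat{S}^{(1)}_z|\uparrow_1\rangle|\uparrow_2\rangle + 6i|\uparrow_1\rangle\hat{S}^{(2)}_z|\uparrow_2\rangle - 4i\hat{S}^{(1)}_z|\downarrow_1\rangle|\downarrow_2\rangle - 4i|\downarrow_1\rangle\hat{S}^{(2)}_z|\downarrow_2\rangle \\
&\qquad + 3\hat{S}^{(1)}_z|\uparrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle\hat{S}^{(2)}_z|\downarrow_2\rangle + 8\hat{S}^{(1)}_z|\downarrow_1\rangle|\uparrow_2\rangle + 8|\downarrow_1\rangle\hat{S}^{(2)}_z|\uparrow_2\rangle \\
&\quad = 3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle \\
&\qquad + \frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - \frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle \\
&\quad = 6i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 4i\hbar|\downarrow_1\rangle|\downarrow_2\rangle
\end{aligned}$$

(b)

$$\begin{aligned}
&\left(\hat{S}^{(1)}_+ + \hat{S}^{(2)}_+\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) \\
&\quad = 6i\hat{S}^{(1)}_+|\uparrow_1\rangle|\uparrow_2\rangle + 6i|\uparrow_1\rangle\hat{S}^{(2)}_+|\uparrow_2\rangle - 4i\hat{S}^{(1)}_+|\downarrow_1\rangle|\downarrow_2\rangle - 4i|\downarrow_1\rangle\hat{S}^{(2)}_+|\downarrow_2\rangle \\
&\qquad + 3\hat{S}^{(1)}_+|\uparrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle\hat{S}^{(2)}_+|\downarrow_2\rangle + 8\hat{S}^{(1)}_+|\downarrow_1\rangle|\uparrow_2\rangle + 8|\downarrow_1\rangle\hat{S}^{(2)}_+|\uparrow_2\rangle \\
&\quad = -4i\hbar|\uparrow_1\rangle|\downarrow_2\rangle - 4i\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 3\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 8\hbar|\uparrow_1\rangle|\uparrow_2\rangle \\
&\quad = 11\hbar|\uparrow_1\rangle|\uparrow_2\rangle - 4i\hbar\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right)
\end{aligned}$$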
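The rule that each operator affects only the states of its own particle can be expressed with matrices: an operator of particle 1 becomes the Kronecker product of its 2 × 2 matrix with the identity, and an operator of particle 2 becomes the identity times its matrix. The sketch below, a hedged illustration rather than the book's method, applies this to the two sums of Example 25. It assumes the basis ordering of Eq. 9.43 with spin-up listed first, sets $\hbar = 1$, and uses helper names invented for this example.

```python
HBAR = 1.0  # illustrative units with hbar = 1

SZ = [[0.5 * HBAR, 0], [0, -0.5 * HBAR]]
S_PLUS = [[0, HBAR], [0, 0]]   # S+|down> = hbar|up>, S+|up> = 0
IDENTITY = [[1, 0], [0, 1]]


def kron_matrix(a, b):
    """4x4 matrix of (a acting on particle 1) times (b acting on particle 2)."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]


def add(a, b):
    """Element-wise sum of two 4x4 matrices."""
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]


def apply(matrix, vec):
    """Multiply a 4x4 matrix by a four-component vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(4)) for i in range(4)]


def total(op):
    """Two-particle operator op(1) + op(2)."""
    return add(kron_matrix(op, IDENTITY), kron_matrix(IDENTITY, op))


# |alpha> of Example 24 in the basis |1>, |2>, |3>, |4>
alpha = [6j, 3, 8, -4j]
sz_alpha = apply(total(SZ), alpha)          # part (a)
splus_alpha = apply(total(S_PLUS), alpha)   # part (b)
```

In units of $\hbar$, `sz_alpha` holds the coefficients `[6j, 0, 0, 4j]` and `splus_alpha` holds `[11, -4j, -4j, 0]`, matching the results of parts (a) and (b).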

9.4.2 Operator of the Total Spin

The concept of the tensor product gives an exact mathematical meaning to Eq. 9.42 and the “plus” sign in it as illustrated by the previous example. Indeed, if each operator appearing on the right-hand side of this equation is defined to act in a common space of two-particle states, then the plus sign generates the regular operator sum as defined in earlier chapters of this book.

Now I can tackle the main question: what are the eigenvalues and eigenvectors of the components of the total spin operator defined by Eq. 9.42, and of its square $\left(\hat{\mathbf{S}}^{tp}\right)^2 = \left(\hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}\right)^2$? When discussing any system of operators, the first question you must be concerned with is the commutation relations between these operators. The first commutator that needs to be dealt with is between operators $\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$, and it is quite obvious that any two components of these operators commute, i.e.,

$$\left[\hat{S}^{(1)}_i, \hat{S}^{(2)}_j\right] = 0$$

for all $i, j$ taking values $x$, $y$, and $z$. Indeed, since $\hat{S}^{(1)}_i$ only affects the states of particle 1, and $\hat{S}^{(2)}_j$ acts only on the states of particle 2, the order in which these operators are applied is irrelevant. Now it is quite easy to establish that all commutation relations for the components of operator $\hat{\mathbf{S}}^{tp}$ and its square are exactly the same as for any operator of angular momentum. This justifies the claim that there exists a system of vectors, $|S, M\rangle$, which are common eigenvectors of one of the components of $\hat{\mathbf{S}}^{tp}$, usually chosen to be $\hat{S}^{tp}_z$, and of the operator $\left(\hat{\mathbf{S}}^{tp}\right)^2$, characterized by two numbers $M$ and $S$ such that

$$\hat{S}^{tp}_z|S, M\rangle = \hbar M|S, M\rangle, \qquad \left(\hat{\mathbf{S}}^{tp}\right)^2|S, M\rangle = \hbar^2 S(S + 1)|S, M\rangle.$$
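The statement that the components of the total spin obey the usual angular momentum commutation relations can be spelled out in one line. The following is a short worked step, using only the single-particle relations of Eqs. 3.53–3.55 and the vanishing of all commutators between operators of different particles:

$$\begin{aligned}
\left[\hat{S}^{tp}_x, \hat{S}^{tp}_y\right] &= \left[\hat{S}^{(1)}_x + \hat{S}^{(2)}_x,\ \hat{S}^{(1)}_y + \hat{S}^{(2)}_y\right] \\
&= \left[\hat{S}^{(1)}_x, \hat{S}^{(1)}_y\right] + \left[\hat{S}^{(1)}_x, \hat{S}^{(2)}_y\right] + \left[\hat{S}^{(2)}_x, \hat{S}^{(1)}_y\right] + \left[\hat{S}^{(2)}_x, \hat{S}^{(2)}_y\right] \\
&= i\hbar\hat{S}^{(1)}_z + 0 + 0 + i\hbar\hat{S}^{(2)}_z = i\hbar\hat{S}^{tp}_z.
\end{aligned}$$

The two remaining relations follow in the same way by cyclic permutation of $x$, $y$, and $z$.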

It also can be claimed that $|M| \le S$ and that these numbers take integer or half-integer values. What is missing at this point is the information about the actual values that $S$ can take and its relation to eigenvalues of the spin operators of the individual particles. Also, one would like to know about the connection between eigenvectors of $\hat{S}^{tp}_z$ and $\left(\hat{\mathbf{S}}^{tp}\right)^2$ and their single-particle counterparts. To answer these questions, I am going to generate matrix representations of operators $\hat{S}^{tp}_z$ and $\left(\hat{\mathbf{S}}^{tp}\right)^2$, using the basis vectors defined in Eq. 9.43. I will start with operator $\hat{S}^{tp}_z$. The application of this operator to the basis vectors yields (see examples above)

$$\hat{S}^{tp}_z|\uparrow_1\rangle|\uparrow_2\rangle = \hbar|\uparrow_1\rangle|\uparrow_2\rangle \tag{9.46}$$

$$\hat{S}^{tp}_z|\uparrow_1\rangle|\downarrow_2\rangle = 0 \tag{9.47}$$

$$\hat{S}^{tp}_z|\downarrow_1\rangle|\uparrow_2\rangle = 0 \tag{9.48}$$

$$\hat{S}^{tp}_z|\downarrow_1\rangle|\downarrow_2\rangle = -\hbar|\downarrow_1\rangle|\downarrow_2\rangle. \tag{9.49}$$

These results indicate that the basis vectors defined in Eq. 9.43 are also eigenvectors of the operator $\hat{S}^{tp}_z$ with eigenvalues $\pm\hbar$ and a doubly degenerate eigenvalue 0. Thus, the matrix of this operator in this basis is a diagonal $4 \times 4$ matrix:

$$\hat{S}^{tp}_z = \hbar\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

where I have positioned the matrix elements in accord with the numeration of eigenvectors introduced in Eq. 9.43. For instance, the right-hand side of Eq. 9.46, where operator $\hat{S}^{tp}_z$ acts on the first of the basis vectors, represents the first column of the matrix, which contains a single non-zero element; the right-hand side of Eq. 9.47 yields the second column, where all elements are zeroes; and so on.

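Operator $\left(\hat{\mathbf{S}}^{tp}\right)^2$ requires more work. First, let me rewrite it in terms of the particles’ operators $\hat{\mathbf{S}}^{(1)}$ and $\hat{\mathbf{S}}^{(2)}$:

$$\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2 &= \left(\hat{\mathbf{S}}^{(1)} + \hat{\mathbf{S}}^{(2)}\right)^2 = \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{\mathbf{S}}^{(1)}\cdot\hat{\mathbf{S}}^{(2)} \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\left(\hat{S}^{(1)}_x\hat{S}^{(2)}_x + \hat{S}^{(1)}_y\hat{S}^{(2)}_y + \hat{S}^{(1)}_z\hat{S}^{(2)}_z\right) \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + 2\left[\frac{\hat{S}^{(1)}_+ + \hat{S}^{(1)}_-}{2}\,\frac{\hat{S}^{(2)}_+ + \hat{S}^{(2)}_-}{2} + \frac{\hat{S}^{(1)}_+ - \hat{S}^{(1)}_-}{2i}\,\frac{\hat{S}^{(2)}_+ - \hat{S}^{(2)}_-}{2i}\right] \\
&= \left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + \hat{S}^{(1)}_+\hat{S}^{(2)}_- + \hat{S}^{(1)}_-\hat{S}^{(2)}_+
\end{aligned}$$

where I replaced the x- and y-components of the spin operator by ladder operators defined in Eqs. 3.59 and 3.60 adapted for spin operators. The last expression is perfectly suited for generating the matrix of $\left(\hat{\mathbf{S}}^{tp}\right)^2$. Applying this operator to each of the basis vectors, I can again simply read out the columns of this matrix:

$$\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2|1\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + \hat{S}^{(1)}_+\hat{S}^{(2)}_- + \hat{S}^{(1)}_-\hat{S}^{(2)}_+\right]|\uparrow_1\rangle|\uparrow_2\rangle \\
&= \frac{3}{4}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle + \frac{1}{2}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle = 2\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle \equiv 2\hbar^2|1\rangle.
\end{aligned} \tag{9.50}$$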

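The ladder operators do not contribute to the final result because the raising operator applied to the spin-up vector yields zero. All other terms in this expression follow from the standard properties of the spin operators. Continuing,

$$\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2|2\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + \hat{S}^{(1)}_+\hat{S}^{(2)}_- + \hat{S}^{(1)}_-\hat{S}^{(2)}_+\right]|\uparrow_1\rangle|\downarrow_2\rangle \\
&= \frac{3}{4}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle - \frac{1}{2}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2|\downarrow_1\rangle|\uparrow_2\rangle \\
&= \hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2|\downarrow_1\rangle|\uparrow_2\rangle \equiv \hbar^2|2\rangle + \hbar^2|3\rangle
\end{aligned} \tag{9.51}$$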

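where the ladder operators in the term $\hat{S}^{(1)}_-\hat{S}^{(2)}_+$ are responsible for the non-zero contribution, in which the spin of each particle is flipped. And again

$$\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2|3\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + \hat{S}^{(1)}_+\hat{S}^{(2)}_- + \hat{S}^{(1)}_-\hat{S}^{(2)}_+\right]|\downarrow_1\rangle|\uparrow_2\rangle \\
&= \frac{3}{4}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle - \frac{1}{2}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle + \hbar^2|\uparrow_1\rangle|\downarrow_2\rangle \\
&= \hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2|\downarrow_1\rangle|\uparrow_2\rangle \equiv \hbar^2|3\rangle + \hbar^2|2\rangle
\end{aligned} \tag{9.52}$$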

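where the inversion of the spins in the last term is due to operators $\hat{S}^{(1)}_+\hat{S}^{(2)}_-$. Finally

$$\begin{aligned}
\left(\hat{\mathbf{S}}^{tp}\right)^2|4\rangle &= \left[\left(\hat{\mathbf{S}}^{(1)}\right)^2 + \left(\hat{\mathbf{S}}^{(2)}\right)^2 + 2\hat{S}^{(1)}_z\hat{S}^{(2)}_z + \hat{S}^{(1)}_+\hat{S}^{(2)}_- + \hat{S}^{(1)}_-\hat{S}^{(2)}_+\right]|\downarrow_1\rangle|\downarrow_2\rangle \\
&= \frac{3}{4}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle + \frac{1}{2}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle = 2\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle \equiv 2\hbar^2|4\rangle.
\end{aligned} \tag{9.53}$$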

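Reading out columns 1 through 4 from Eqs. 9.50 to 9.53 correspondingly, I generate the desired matrix:

$$\left[\left(\hat{\mathbf{S}}^{tp}\right)^2\right]_{ij} = \hbar^2\begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix}.$$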

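What is left now is to find its eigenvalues and eigenvectors, i.e., to solve the eigenvalue problem:

$$\hbar^2\begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} = \lambda\hbar^2\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}.$$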

Two eigenvalues can be found just by looking at Eqs. 9.50 and 9.53, which indicate that vectors $|1\rangle$ and $|4\rangle$ are eigenvectors of this matrix with eigenvalues $2\hbar^2$. This circumstance is reflected in the structure of the matrix, where the first and fourth rows as well as the first and fourth columns contain single non-zero elements. Such matrices are known as block-diagonal, and what makes them special is that each block can be considered independently of the others and treated accordingly. For instance, the equations for $a_1$ and $a_4$ will not contain any other coefficients, while equations for elements $a_2$ and $a_3$ will only contain these two elements. Since I already know that solutions with $a_1 = 1$, $a_{2,3,4} = 0$ and $a_4 = 1$, $a_{1,2,3} = 0$ are eigenvectors corresponding to $\lambda = 2$ (that is, to the eigenvalue $2\hbar^2$), I only need to deal with the remaining two coefficients $a_2$ and $a_3$ satisfying equations

$$\begin{aligned}
a_2 + a_3 &= \lambda a_2 \\
a_2 + a_3 &= \lambda a_3.
\end{aligned}$$

Subtracting one equation from the other gives $\lambda(a_2 - a_3) = 0$, so it immediately follows from this system that either $a_2 = a_3$ or $\lambda = 0$. In the former case, I have

$$\lambda = 2,$$

while the latter one gives me

$$a_2 = -a_3.$$

Thus, I end up once again with eigenvalue $2\hbar^2$, but now it belongs to the eigenvector

$$\frac{1}{\sqrt{2}}\left(|2\rangle + |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right)$$

where I set $a_2 = a_3 = 1/\sqrt{2}$ to make this vector normalized. I also got a new eigenvalue equal to zero with eigenvector

$$\frac{1}{\sqrt{2}}\left(|2\rangle - |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right).$$

Recalling that eigenvalues of $\left(\hat{\mathbf{S}}^{tp}\right)^2$ must have the form $\hbar^2 S(S + 1)$, I can immediately deduce that eigenvalue $2\hbar^2$ corresponds to $S = 1$, while eigenvalue zero, obviously, corresponds to $S = 0$.
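The matrix of the squared total spin and its eigenvectors can be cross-checked numerically without using the ladder-operator expansion. The sketch below is an independent illustration under stated assumptions: it builds the three Cartesian components of the total spin as Kronecker sums of the standard spin-1/2 matrices (units $\hbar = 1$, basis ordering of Eq. 9.43) and adds their squares. All helper names are invented for this example.

```python
import math

HBAR = 1.0  # illustrative units with hbar = 1

SX = [[0, 0.5 * HBAR], [0.5 * HBAR, 0]]
SY = [[0, -0.5j * HBAR], [0.5j * HBAR, 0]]
SZ = [[0.5 * HBAR, 0], [0, -0.5 * HBAR]]
IDENTITY = [[1, 0], [0, 1]]


def kron_matrix(a, b):
    """4x4 matrix of (a acting on particle 1) times (b acting on particle 2)."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]


def add(a, b):
    """Element-wise sum of two 4x4 matrices."""
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, vec):
    """Multiply a 4x4 matrix by a four-component vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(4)) for i in range(4)]


def total_spin_squared():
    """Sum over x, y, z of (S_i(1) + S_i(2))^2 in the basis of Eq. 9.43."""
    result = [[0] * 4 for _ in range(4)]
    for op in (SX, SY, SZ):
        component = add(kron_matrix(op, IDENTITY), kron_matrix(IDENTITY, op))
        result = add(result, matmul(component, component))
    return result


S2 = total_spin_squared()
r = 1 / math.sqrt(2)
SYMMETRIC = [0, r, r, 0]       # (|2> + |3>)/sqrt(2)
ANTISYMMETRIC = [0, r, -r, 0]  # (|2> - |3>)/sqrt(2)
```

Applying `S2` to `SYMMETRIC` returns the same vector multiplied by 2, while applying it to `ANTISYMMETRIC` returns the zero vector, in line with the eigenvalues $2\hbar^2$ and 0 found above.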

It is time to put all these results together. Here is what I have: a triply degenerate eigenvalue characterized by spin $S = 1$ and three eigenvectors

$$\begin{aligned}
|1, 1\rangle &= |\uparrow_1\rangle|\uparrow_2\rangle \\
|1, 0\rangle &= \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right) \\
|1, -1\rangle &= |\downarrow_1\rangle|\downarrow_2\rangle
\end{aligned} \tag{9.54}$$

and a single non-degenerate eigenvalue corresponding to $S = 0$ with eigenvector

$$|0, 0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right) \tag{9.55}$$

attached to it. Notations used for these eigenvectors follow the traditional scheme $|S, M\rangle$ and reflect the facts that all three eigenvectors in Eq. 9.54 are simultaneously eigenvectors of operator $\hat{S}^{tp}_z$ with corresponding quantum numbers $M = 1$, $M = 0$, and $M = -1$, while a single eigenvector in Eq. 9.55 is also an eigenvector of $\hat{S}^{tp}_z$ corresponding to $M = 0$. You might want to pay attention to the fact that both superposition eigenvectors $|1, 0\rangle$ and $|0, 0\rangle$ are linear combinations of the eigenvectors of $\hat{S}^{tp}_z$ established in Eqs. 9.47 and 9.48 belonging to a doubly degenerate eigenvalue 0 of $\hat{S}^{tp}_z$, which reflects the general notion that linear combinations of degenerate eigenvectors are also eigenvectors belonging to the same eigenvalue. The particular combinations appearing in Eqs. 9.54 and 9.55 ensure that these vectors are simultaneously eigenvectors of the operator $\left(\hat{\mathbf{S}}^{tp}\right)^2$. The results presented in these equations also reflect the general property of the angular momentum operators: the value of quantum number $S$ determines the maximum and minimum allowed values of the second quantum number $M$ and, respectively, the total number $2S + 1$ of eigenvectors belonging to the given eigenvalue of $\left(\hat{\mathbf{S}}^{tp}\right)^2$. Indeed, for $S = 1$, we have three vectors with $M$ ranging from $-1$ to 1, while for $S = 0$, there exists a single vector with $M = 0$. This situation is often described by saying that the system of two spin-1/2 particles can be in two states characterized by the total spin equal to one or zero. The former is called a triplet state, reflecting the existence of three distinct states with the same $S$ and different magnetic numbers $M$, and the latter is called a singlet for obvious enough reasons. People also often say that in the triplet state, the spins of the particles are parallel to each other, while in the singlet state, they are antiparallel, but this is highly misleading. Even leaving aside the obvious quantum mechanical fact that the direction of spin in quantum mechanics is not defined because only one component of the vector can have a definite value in a given state, parallel or antiparallel can refer only to the sign of the z-component of the spin determined by the value of $M$. As we have just seen, this number can be equal to zero, reflecting the “antiparallel” orientation of the particles’ spins, when the particles are either in the $S = 1$ or $S = 0$ state. Therefore, a more accurate verbal description of the situation (if you really need one) may sound like this: in the triplet spin states, the particles’ spins can be either parallel or antiparallel, while in the singlet state, they can only be antiparallel.

To complete this discussion, let me direct your attention to another interesting difference between triplet and singlet states. The former are symmetric with respect to the exchange of the particles, while the latter are antisymmetric. What it means is that if you swap particle indexes 1 and 2 in Eqs. 9.54 and 9.55 (exchange particles one and two), the states described by the former equation do not change, while the singlet state described by the latter equation changes its sign. The operation of particle exchange reflects the classical idea that you can somehow distinguish between the particles, marking them as one and two, and then swap them by placing particle one in the state of particle two and vice versa. In quantum mechanics two electrons are not really distinguishable, and, therefore, the swapping operation shouldn’t change the properties of the system. This topic will be discussed in much more detail in Chap. 11 devoted to quantum mechanics of many identical particles. Here I just want to mention, giving you a brief preview of what is coming, that the symmetry and antisymmetry of the spin states of the two-particle system are a reflection of quantum indistinguishability of electrons.
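The exchange symmetry of the triplet and singlet states can be made concrete with a few lines of code. In the basis of Eq. 9.43, swapping the particle labels leaves $|1\rangle$ and $|4\rangle$ untouched and interchanges $|2\rangle$ and $|3\rangle$. The sketch below, with function names invented for this illustration, applies that swap to a list of four coefficients and reports whether the state is symmetric (+1), antisymmetric (−1), or neither (`None`).

```python
import math


def swap_particles(state):
    """Exchange particle labels: |2> = |up, down> and |3> = |down, up> trade places."""
    c1, c2, c3, c4 = state
    return [c1, c3, c2, c4]


def exchange_parity(state, tol=1e-12):
    """+1 for a symmetric state, -1 for an antisymmetric one, None otherwise."""
    swapped = swap_particles(state)
    if all(abs(s - c) < tol for s, c in zip(swapped, state)):
        return 1
    if all(abs(s + c) < tol for s, c in zip(swapped, state)):
        return -1
    return None


r = 1 / math.sqrt(2)
TRIPLET = [[1, 0, 0, 0], [0, r, r, 0], [0, 0, 0, 1]]  # |1,1>, |1,0>, |1,-1>
SINGLET = [0, r, -r, 0]                               # |0,0>
```

A product state such as `[0, 1, 0, 0]` (first spin up, second spin down) has no definite exchange parity, which is why `exchange_parity` returns `None` for it.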

9.5 Operator of Total Angular Momentum

9.5.1 Combining Orbital and Spin Degrees of Freedom

When discussing the model of spin 1/2 or addition of two such spins, I intentionally ignored the fact that the spin is “attached” to a particle, which can be involved in all kinds of crazy things such as being a part of an atom or rushing through a piece of metal carrying an electric current. At the same time, such phenomena as resonant tunneling or the hydrogen atom in the previous chapters were treated with utter ignorance of the fact that in addition to “regular” observables, such as position or momentum, electrons also carry around their spin, which is as inalienable as their mass or charge. Now the time has come to design a formalism allowing one to treat spin and orbital properties² of the electrons (and other particles with spin) together.

First of all, one needs to recognize that the spinors and orbital vectors are completely different animals and inhabit different habitats. For instance, while you can represent eigenvectors of momentum and angular momentum in the same, say, position representation or express them in terms of each other, it is impossible to construct a position representation for the eigenvectors of the spin operators or present momentum eigenvectors as a linear combination of spinors. Accordingly, operators acting on orbital vectors do not affect spinors, and spin operators are indifferent to vectors representing orbital states. One of the trivial consequences of this is, of course, that orbital and spin operators always commute. Giving these statements a bit of a thought, you can notice a certain similarity with the just discussed two-spin problem, where we also had to deal with vectors belonging to two unrelated spaces and being acted upon only by their “native” operators. That situation was handled by combining spinors representing spin states of different particles into a common space formed as a tensor product of the spaces of each individual spin. Similarly, spin and orbital spaces of a single particle can also be combined into a tensor product space by stacking together all combinations of the basis vectors from both spaces. Assuming that the orbital space is described by some discrete basis $\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle$ based on a set of mutually consistent observables, a typical basis vector in a compound tensor product space can be made to look something like this:

$$\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|m_{si}\rangle, \tag{9.56}$$

where $|m_{si}\rangle$ is a basis spinor. Since there are only two of those, the dimension of the combined space is two times the dimensionality of the orbital space. Indeed, attaching the spin state to each orbital basis vector $\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle$, I am generating two new basis vectors:

$$\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|1/2\rangle \quad\text{and}\quad \left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|-1/2\rangle,$$

or, if you prefer,

$$\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|\uparrow\rangle \quad\text{and}\quad \left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|\downarrow\rangle.$$

² By orbital properties I understand all those properties of the particle that can be described using quantum states related to position or momentum operators or a combination thereof. In what follows I will call these states and vectors representing them orbital states or orbital vectors.

Sometimes the indicator of a spin state is put inside a single ket or bra vector together with the signifiers of all other observables:

$$\left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p\right\rangle|m_{si}\rangle \equiv \left|q^{(1)}_k, q^{(2)}_m, \cdots q^{(N_{\max})}_p, m_{si}\right\rangle, \tag{9.57}$$

but this notation hides the critical difference between the spin and orbital observables and makes some calculations less intuitive, so I would prefer using the notation of Eq. 9.56 most of the time. Nevertheless, sometimes it might be appropriate to use the simplified notation of Eq. 9.57, and if you notice me doing it, do not start throwing stones: this is just a notation, chosen based on convenience and a moment’s expedience.

An arbitrary vector j�i residing in the tensor product space can be presented as

j�i D X

km;���p akm���pI"

ˇ̌ ˇq.1/k ; q.2/m ; � � � q.Nmax/p

E j"i C

X km;���p

akm���pI# ˇ̌ ˇq.1/k ; q.2/m ; � � � q.Nmax/p

E j#i : (9.58)

The expansion coefficients $a_{km\cdots p;\uparrow}$ now define the probability $\left|a_{km\cdots p;\uparrow}\right|^2$ that a measurement of the mutually consistent observables, including a component of the spin, will yield values $k, m, \cdots p$ for the regular observables and $\hbar/2$ for the spin component. The set of coefficients $a_{km\cdots p;\downarrow}$ defines the probability $\left|a_{km\cdots p;\downarrow}\right|^2$ that the observation will produce the same values of all the orbital observables and the value $-\hbar/2$ for the spin. The sum of probabilities

$$p_{km\cdots p} = \left|a_{km\cdots p;\uparrow}\right|^2 + \left|a_{km\cdots p;\downarrow}\right|^2$$

yields the probability to observe the given values of the observables provided that the spin is not measured, while the sums

$$p_\uparrow = \sum_{k,m,\cdots p} \left|a_{km\cdots p;\uparrow}\right|^2$$

or

$$p_\downarrow = \sum_{k,m,\cdots p} \left|a_{km\cdots p;\downarrow}\right|^2$$

generate probabilities of getting values of the spin component $\hbar/2$ or $-\hbar/2$, respectively, regardless of the values of other observables. Finally, the normalization condition for the expansion coefficients must now include the summation over all available variables:

$$\sum_{k,m,\cdots p} \left[\left|a_{km\cdots p;\uparrow}\right|^2 + \left|a_{km\cdots p;\downarrow}\right|^2\right] = 1. \tag{9.59}$$

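Equations 9.56–9.58 are written under the assumption that the basis in the orbital space is discrete. However, they can be easily adapted to representations in a continuous basis by replacing all the sums with integrals and probabilities with corresponding probability densities. For instance, in the basis of the position eigenvectors $|\mathbf{r}\rangle$, Eqs. 9.56 and 9.58 become $|\mathbf{r}\rangle |m_s\rangle$ and

$$|\chi\rangle = \int d^3r\, \psi_\uparrow(\mathbf{r})\, |\mathbf{r}\rangle |\uparrow\rangle + \int d^3r\, \psi_\downarrow(\mathbf{r})\, |\mathbf{r}\rangle |\downarrow\rangle. \tag{9.60}$$

Here $\left|\psi_{m_s}(\mathbf{r})\right|^2$ gives the position probability density for the corresponding spin state $|m_s\rangle$, the sum $\left|\psi_\uparrow(\mathbf{r})\right|^2 + \left|\psi_\downarrow(\mathbf{r})\right|^2$ yields the same, but when the spin state is not important, and $\int d^3r \left|\psi_{m_s}(\mathbf{r})\right|^2$ generates the probability of finding the particle in the spin state $|m_s\rangle$. The normalization Eq. 9.59 now becomes

$$\int d^3r \left[\left|\psi_\uparrow(\mathbf{r})\right|^2 + \left|\psi_\downarrow(\mathbf{r})\right|^2\right] = 1. \tag{9.61}$$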

One can generate particular representations for the generic vectors $|\chi\rangle$ by choosing specific bases for the orbital and spinor components of the states. One of the most popular choices is to use the position representation for the orbital vectors and eigenvectors of operator $\hat{S}_z$ for the spinor component. The respective representation is generated by premultiplying $\left|q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{\max})}\right\rangle$ by $\langle\mathbf{r}|$, which yields

$$\psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) = \left\langle \mathbf{r} \,\middle|\, q_k^{(1)}, q_m^{(2)}, \cdots q_p^{(N_{\max})}\right\rangle,$$

and by replacing $|m_s\rangle$ with a corresponding two-component column: $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ for the spin-up (or $+1/2$) state and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ for the spin-down (or $-1/2$) state. Since the coordinate representation for the orbital states is almost always used in conjunction with the representation of spinors in the basis of the eigenvectors of the $\hat{S}_z$ operator, I will call this form the coordinate–spinor representation. Then the combined spin–orbital state takes the form

$$\psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{or}\quad \psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

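depending on the chosen spin state. The generic state vector represented by Eq. 9.58 in this representation becomes (I will keep the same notation for the abstract vector and its coordinate–spinor representation to avoid introducing new symbols when it is not really necessary and should not cause any confusion)

$$|\chi\rangle = \sum_{k,m,\cdots p} a_{km\cdots p;\uparrow}\, \psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sum_{k,m,\cdots p} a_{km\cdots p;\downarrow}\, \psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \Psi_\uparrow(\mathbf{r},t) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Psi_\downarrow(\mathbf{r},t) \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \Psi_\uparrow(\mathbf{r},t) \\ \Psi_\downarrow(\mathbf{r},t) \end{pmatrix}, \tag{9.62}$$

where

$$\Psi_\uparrow(\mathbf{r},t) = \sum_{k,m,\cdots p} a_{km\cdots p;\uparrow}(t)\, \psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}), \qquad \Psi_\downarrow(\mathbf{r},t) = \sum_{k,m,\cdots p} a_{km\cdots p;\downarrow}(t)\, \psi_{q_k^{(1)},q_m^{(2)},\cdots q_p^{(N_{\max})}}(\mathbf{r}) \tag{9.63}$$

are the orbital wave functions corresponding to the spin-up and spin-down states, respectively. These functions appear in Eq. 9.63 as linear combinations of the initial basis vectors in their position representations. Obviously, $\Psi_\uparrow(\mathbf{r},t)$ and $\Psi_\downarrow(\mathbf{r},t)$ in these expressions are the same functions that appear in Eq. 9.60, which presents the expansion of an abstract vector $|\chi\rangle$ in the basis of position eigenvectors.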

Any combination of orbital and spin operators acts on vectors defined by Eq. 9.58 or 9.62 following a simple rule: orbital operators act on the orbital component of the vector, and spin operators affect only its spin component. To illustrate this point, consider the following example.

Example 26 (Using Operators of Orbital and Spin Angular Momentum.) Consider the following vector representing a state of an electron in a hydrogen atom:

$$|\alpha\rangle = \frac{2}{3}\,|2,1,-1\rangle |\uparrow\rangle + \frac{1}{3}\,|1,0,0\rangle |\downarrow\rangle - \frac{1}{3}\,|2,0,0\rangle |\uparrow\rangle + \frac{1}{\sqrt{3}}\,|2,1,1\rangle |\downarrow\rangle,$$

where the orbital portion of the state follows the standard notation $|n,l,m\rangle$. Compute the following expressions:

1. $\langle\alpha| \hat{H} |\alpha\rangle$, where $\hat{H}$ is the Hamiltonian of a hydrogen atom, Eq. 8.6.
2. $\left(\hat{L}_+\hat{S}_- + \hat{L}_-\hat{S}_+\right)|\alpha\rangle$.
3. $\left(\hat{L}_z + \hat{S}_z\right)|\alpha\rangle$.
4. Write down vector $|\alpha\rangle$ in the coordinate–spinor representation.

Solution

1. I begin by computing

$$\hat{H}|\alpha\rangle = -\frac{2}{3}\frac{E_1}{4}\,|2,1,-1\rangle |\uparrow\rangle - \frac{1}{3}E_1\,|1,0,0\rangle |\downarrow\rangle + \frac{1}{3}\frac{E_1}{4}\,|2,0,0\rangle |\uparrow\rangle - \frac{1}{\sqrt{3}}\frac{E_1}{4}\,|2,1,1\rangle |\downarrow\rangle,$$

where $-E_1$ is the hydrogen ground state energy. Now I find

$$\langle\alpha| \hat{H} |\alpha\rangle = -\frac{E_1}{9} - \frac{E_1}{9} - \frac{E_1}{36} - \frac{E_1}{12} = -\frac{E_1}{3},$$

where I took into account that all terms in the expression above remain mutually orthogonal, so that all cross-product terms in the inner product vanish. The spin components of the state are not affected by the Hamiltonian because it does not contain any spin operators.

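2.

$$\left(\hat{L}_+\hat{S}_- + \hat{L}_-\hat{S}_+\right)|\alpha\rangle = \frac{2}{3}\sqrt{2}\,\hbar^2\,|2,1,0\rangle |\downarrow\rangle + \frac{1}{\sqrt{3}}\sqrt{2}\,\hbar^2\,|2,1,0\rangle |\uparrow\rangle = \sqrt{\frac{2}{3}}\,\hbar^2\,|2,1,0\rangle \left(\frac{2}{\sqrt{3}}\,|\downarrow\rangle + |\uparrow\rangle\right),$$

where I applied the orbital and spin ladder operators separately to the corresponding orbital and spin portions of the vectors using Eqs. 3.75, 3.76, 5.104, and 5.106, respectively. In particular, I found that

$$\hat{L}_+\hat{S}_-\,|2,1,1\rangle |\downarrow\rangle = \hat{L}_+|2,1,1\rangle\, \hat{S}_-|\downarrow\rangle = 0$$

as well as that

$$\hat{L}_-\hat{S}_+\,|2,1,-1\rangle |\uparrow\rangle = \hat{L}_-|2,1,-1\rangle\, \hat{S}_+|\uparrow\rangle = 0.$$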

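3.

$$\left(\hat{L}_z + \hat{S}_z\right)|\alpha\rangle = -\hbar\frac{2}{3}\,|2,1,-1\rangle |\uparrow\rangle + \frac{2}{3}\frac{\hbar}{2}\,|2,1,-1\rangle |\uparrow\rangle - \frac{1}{3}\frac{\hbar}{2}\,|1,0,0\rangle |\downarrow\rangle - \frac{1}{3}\frac{\hbar}{2}\,|2,0,0\rangle |\uparrow\rangle + \frac{1}{\sqrt{3}}\hbar\,|2,1,1\rangle |\downarrow\rangle - \frac{1}{\sqrt{3}}\frac{\hbar}{2}\,|2,1,1\rangle |\downarrow\rangle$$

$$= \frac{\hbar}{2}\left[-\frac{2}{3}\,|2,1,-1\rangle |\uparrow\rangle - \frac{1}{3}\,|1,0,0\rangle |\downarrow\rangle - \frac{1}{3}\,|2,0,0\rangle |\uparrow\rangle + \frac{1}{\sqrt{3}}\,|2,1,1\rangle |\downarrow\rangle\right].$$

4. A coordinate–spinor representation of vector $|\alpha\rangle$ looks like this:

$$\begin{bmatrix} \dfrac{2}{3} R_{21}(r)\, Y_1^{-1}(\theta,\varphi) - \dfrac{1}{3\sqrt{4\pi}} R_{20}(r) \\[2ex] \dfrac{1}{3\sqrt{4\pi}} R_{10}(r) + \dfrac{1}{\sqrt{3}} R_{21}(r)\, Y_1^{1}(\theta,\varphi) \end{bmatrix}.$$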
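The arithmetic of Example 26 can be cross-checked with a short script. The sketch below is only an illustration: the dictionary layout, the label `two_ms` (twice the spin projection), and the function names are my own assumptions, and energies are measured in units of $E_1$ while angular momenta are measured in units of $\hbar$.

```python
from math import sqrt, isclose

# |alpha> stored as {(n, l, m, two_ms): coefficient}.
# two_ms = +1 labels spin-up, two_ms = -1 labels spin-down (hypothetical labels).
alpha = {
    (2, 1, -1, +1): 2 / 3,
    (1, 0, 0, -1): 1 / 3,
    (2, 0, 0, +1): -1 / 3,
    (2, 1, 1, -1): 1 / sqrt(3),
}

def energy_expectation(state):
    """<state|H|state> in units of E1, using E_n = -E1 / n**2."""
    return sum(-abs(c) ** 2 / n ** 2 for (n, _l, _m, _s), c in state.items())

def spin_flip_term(state):
    """Apply (L+ S- + L- S+) to the state, in units of hbar**2."""
    out = {}
    for (n, l, m, two_ms), c in state.items():
        if two_ms == +1:
            # Only L+ S- survives on a spin-up state: m -> m + 1, up -> down.
            new_m, factor = m + 1, l * (l + 1) - m * (m + 1)
        else:
            # Only L- S+ survives on a spin-down state: m -> m - 1, down -> up.
            new_m, factor = m - 1, l * (l + 1) - m * (m - 1)
        if factor > 0:
            key = (n, l, new_m, -two_ms)
            out[key] = out.get(key, 0.0) + c * sqrt(factor)
    return out

def total_jz(state):
    """Apply (Lz + Sz) to the state, in units of hbar."""
    return {key: c * (key[2] + key[3] / 2) for key, c in state.items()}

print(energy_expectation(alpha))   # part 1, in units of E1
print(spin_flip_term(alpha))       # part 2, in units of hbar**2
print(total_jz(alpha))             # part 3, in units of hbar
```

Storing the state as a dictionary keyed by quantum numbers mirrors the rule stated above: each operator touches only "its" part of the key, and the coefficients carry the rest.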

If $\Psi_\uparrow(\mathbf{r},t)$ and $\Psi_\downarrow(\mathbf{r},t)$ can be written down as

$$\Psi_\uparrow(\mathbf{r},t) = a_1(t)\,\psi(\mathbf{r},t); \qquad \Psi_\downarrow(\mathbf{r},t) = a_2(t)\,\psi(\mathbf{r},t), \tag{9.64}$$

Eq. 9.62 becomes

$$|\chi\rangle = \psi(\mathbf{r},t) \begin{pmatrix} a_1(t) \\ a_2(t) \end{pmatrix}, \tag{9.65}$$

resulting in the separation of spin and orbital components of the state. The spin and orbital properties of the particle in such a state are completely independent of each other, and changing one of them would not affect the other. In a more generic case, when $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ are two different functions, the orbital state of the particle depends on its spin state and vice versa. This interdependence is called "spin–orbit coupling" and is responsible for many important phenomena. Some of them are old, known for a century, while others have been discovered only recently. For instance, spin–orbit interaction is responsible for the fine structure of atomic spectra (an old phenomenon known from the earlier days of quantum mechanics), but it also gave birth to an entire new "hot" research area in contemporary semiconductor physics known as spintronics. Researchers working in this field seek to control the flow of electrons using their spin as a steering wheel and also to control the orientation of an electron's spin by affecting its electric current. I will talk more about spin–orbit coupling and its effect on atomic spectra in Chap. 14, but for the spintronics effects, you will have to consult a more specialized book.

While the abstract form of the Schrödinger equation

$$i\hbar \frac{\partial |\chi\rangle}{\partial t} = \hat{H} |\chi\rangle$$

stays the same even when the spin and orbital degrees of freedom are combined, its position representation, which is frequently used for practical calculations, needs to be modified. Indeed, in the representation described by Eq. 9.62, a state of a particle is described by two wave functions corresponding to two different spin states. Accordingly, a single Schrödinger equation becomes a system of two equations, whose form depends on the interactions included in the Hamiltonian. To find the explicit form of these equations, you will need to convert operator $\hat{H}$ into the combined position–spinor representation. This can be done independently for the orbital and spin portions of the Hamiltonian, with the result that can be presented in the form

$$\hat{H} \to \hat{H}_{m_s,m_s'}(\mathbf{r}) \equiv \langle m_s| \hat{H}(\mathbf{r}) |m_s'\rangle,$$

where the matrix indices $m_s, m_s'$ take values 1 or 2 corresponding, respectively, to $m_s = 1/2$ and $m_s = -1/2$. Thus, the Hamiltonian in the presence of the spin becomes a $2 \times 2$ matrix, and its action on the state presented in the form of Eq. 9.62 involves (in addition to what it normally does to orbital vectors) the multiplication of a matrix and a spinor. In the most trivial case, when the Hamiltonian does not contain any spin operators and does not act, therefore, on spin states, this matrix becomes

$$\hat{H}_{m_s,m_s'}(\mathbf{r}) \equiv \langle m_s| \hat{H}(\mathbf{r}) |m_s'\rangle = \hat{H}(\mathbf{r})\, \langle m_s | m_s'\rangle = \hat{H}(\mathbf{r})\, \delta_{m_s,m_s'},$$

so that the Schrödinger equations for both wave function components $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ are identical. In this case the total state of the system is described by a vector of the form given by Eq. 9.65, in which the coefficients $a_1$ and $a_2$ of the spinor component can be chosen arbitrarily. Physically this means that in the absence of spin-related terms in the Hamiltonian, the spin state of the particle does not change with time and is determined by the initial conditions.

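Now let me consider a less trivial case, when the Hamiltonian includes a stand-alone spin operator, something like what we dealt with in Sect. 9.3:

$$\hat{H} = \hat{H}_{orb} + \frac{2\mu_B B}{\hbar}\, \hat{S}_z. \tag{9.66}$$

Here $\hat{H}_{orb}$ is a spin-independent portion of the Hamiltonian, and the second term, as you know, describes the interaction of the spin with a uniform magnetic field $B$ directed along the Z-axis. In the matrix form, this Hamiltonian becomes

$$\hat{H}_{m_s,m_s'} = \hat{H}_{orb}\, \delta_{m_s,m_s'} + \mu_B B \left(\hat{\sigma}_z\right)_{m_s,m_s'}, \tag{9.67}$$

where I used the representation of the spin operators in terms of the corresponding Pauli matrices introduced in Eqs. 9.15–9.17. The explicit matrix form of the stationary Schrödinger equation becomes

$$\begin{pmatrix} \hat{H}_{orb} & 0 \\ 0 & \hat{H}_{orb} \end{pmatrix} \begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix} + \mu_B B \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix} = E \begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix}$$

and translates into two independent equations:

$$\hat{H}_{orb} \Psi_\uparrow(\mathbf{r}) + \mu_B B\, \Psi_\uparrow(\mathbf{r}) = E\, \Psi_\uparrow(\mathbf{r}) \tag{9.68}$$

$$\hat{H}_{orb} \Psi_\downarrow(\mathbf{r}) - \mu_B B\, \Psi_\downarrow(\mathbf{r}) = E\, \Psi_\downarrow(\mathbf{r}). \tag{9.69}$$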

This independence signifies the absence of any spin–orbit coupling in this system: the functions $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ can be chosen in the form of Eq. 9.64, where $\psi(\mathbf{r})$ is a solution of the orbital equation $\hat{H}_{orb}\,\psi(\mathbf{r}) = E_{orb}\,\psi(\mathbf{r})$. With this, Eqs. 9.68 and 9.69 can be converted into equations

$$a_1 \left(E - E_{orb} - \mu_B B\right) = 0, \qquad a_2 \left(E - E_{orb} + \mu_B B\right) = 0,$$

yielding two eigenvalues $E^{(1)} = E_{orb} + \mu_B B$ and $E^{(2)} = E_{orb} - \mu_B B$, with two respective eigenvectors $a_1^{(1)} = 1,\ a_2^{(1)} = 0$ and $a_1^{(2)} = 0,\ a_2^{(2)} = 1$. Choosing the zero level of energy at $E_{orb}$ and disregarding the orbital part of the resulting spinors

$$\left|\psi^{(1)}\right\rangle = \psi(\mathbf{r}) \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \left|\psi^{(2)}\right\rangle = \psi(\mathbf{r}) \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

which does not affect any of the phenomena associated with the action of the magnetic field on the electron's spin, you end up with eigenvalues

$$E^{(1,2)} = \pm \mu_B B$$

and eigenvectors

$$\left|\psi^{(1)}\right\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad \left|\psi^{(2)}\right\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

identical to those found for a single spin in the magnetic field in Sect. 9.3. This example demonstrates that the "pure" spin approach, which ignores orbital components of the total state of a particle, is justified as long as the presence of the spin does not change its orbital state, i.e., in the absence of the spin–orbit interaction.

9.5.2 Total Angular Momentum: Eigenvalues and Eigenvectors

In Example 26 in the preceding section, you learned that working in the tensor product of spin and orbital spaces, you can operate with expressions combining orbital and spin operators such as $\hat{L}_z + \hat{S}_z$. This is the z-component of a vector operator

$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} \tag{9.70}$$

called the operator of the total angular momentum, which plays an important role in the general structure of quantum mechanics as well as in a variety of its applications. For instance, this operator is crucial for understanding the energy levels of the hydrogen atom in the presence of spin–orbit coupling and a magnetic field; I will introduce you to these topics in Chap. 14. Here my objective is to elucidate the general properties of this operator, which appears to be a logical conclusion to the discussion started in the previous section.

I begin by stating that components of vector $\hat{\mathbf{J}}$ obey the same commutation relations as those of its constituent vectors $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$. This statement is easy to verify, taking into account that orbital and spin operators commute. For instance, you can check that

$$\hat{J}_x\hat{J}_y - \hat{J}_y\hat{J}_x = \hat{L}_x\hat{L}_y - \hat{L}_y\hat{L}_x + \hat{S}_x\hat{S}_y - \hat{S}_y\hat{S}_x = i\hbar\hat{L}_z + i\hbar\hat{S}_z = i\hbar\hat{J}_z, \tag{9.71}$$

where I canceled terms like $\hat{L}_x\hat{S}_y - \hat{S}_y\hat{L}_x = 0$. Once the commutation relations for the components of $\hat{\mathbf{J}}$ are established, one can immediately claim that all components of $\hat{\mathbf{J}}$ commute with operator $\hat{J}^2$, which can be written down as

$$\hat{J}^2 = \hat{L}^2 + \hat{S}^2 + 2\,\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}. \tag{9.72}$$

Indeed, the proof of the similar statement for orbital angular momentum carried out in Sect. 3.3.2 was based exclusively on the inter-component commutation relations and is, therefore, automatically extended to all operators with the same commutation relations. If you go back to Sect. 3.3.4, you will recall that the derivation of the eigenvalues of the orbital angular momentum operators carried out there also relied exclusively on the commutation relations. Therefore, you can immediately claim, without fear of retribution or embarrassment, that operators $\hat{J}^2$ and $\hat{J}_z$ possess a common system of eigenvectors, characterized by two numbers $j$ and $m_J$, which satisfy the inequality $-j \le m_J \le j$, take either integer or half-integer values, and generate eigenvalues of these operators according to

$$\hat{J}^2\, |j, m_J\rangle = \hbar^2 j(j+1)\, |j, m_J\rangle \tag{9.73}$$

$$\hat{J}_z\, |j, m_J\rangle = \hbar m_J\, |j, m_J\rangle. \tag{9.74}$$
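The commutation relation of Eq. 9.71 and the eigenvalue rule of Eq. 9.73 can be checked numerically on a small example. The sketch below is an illustration under my own assumptions: it takes $l = 1$ and spin $1/2$, sets $\hbar = 1$, orders each basis as $m = j, j-1, \ldots, -j$, and uses hand-written helper functions (all names are hypothetical) instead of a linear algebra library.

```python
from math import sqrt

def ladder_plus(two_j):
    """Raising operator for angular momentum j = two_j / 2, in units of hbar.

    The basis is ordered m = j, j - 1, ..., -j, so the operator moves the
    state stored in column c (c >= 1) into row c - 1.
    """
    j = two_j / 2
    dim = two_j + 1
    mat = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = j - col
        mat[col - 1][col] = complex(sqrt(j * (j + 1) - m * (m + 1)))
    return mat

def dagger(a):
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]

def combine(ca, a, cb, b):
    """Return ca * a + cb * b for equally sized matrices."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Tensor (Kronecker) product with the index of `a` varying slowest."""
    return [[a[i][j] * b[k][n] for j in range(len(a[0])) for n in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[complex(r == c) for c in range(n)] for r in range(n)]

def components(two_j):
    """Return the (x, y, z) matrices for angular momentum j = two_j / 2."""
    plus = ladder_plus(two_j)
    minus = dagger(plus)
    dim = two_j + 1
    x = combine(0.5, plus, 0.5, minus)
    y = combine(-0.5j, plus, 0.5j, minus)        # (plus - minus) / (2i)
    z = [[complex(two_j / 2 - r) if r == c else 0j for c in range(dim)]
         for r in range(dim)]
    return x, y, z

def total(two_l):
    """Components of J = L + S on the product space (orbital index first)."""
    dim_l = two_l + 1
    return [combine(1, kron(lk, identity(2)), 1, kron(identity(dim_l), sk))
            for lk, sk in zip(components(two_l), components(1))]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

jx, jy, jz = total(2)                                        # l = 1, spin 1/2
comm_xy = combine(1, matmul(jx, jy), -1, matmul(jy, jx))
residual_commutator = max_abs(combine(1, comm_xy, -1j, jz))  # [Jx, Jy] - i Jz

j_squared = combine(1, combine(1, matmul(jx, jx), 1, matmul(jy, jy)),
                    1, matmul(jz, jz))
residual_casimir = max_abs(
    combine(1, matmul(j_squared, jz), -1, matmul(jz, j_squared)))

# The first basis vector is |l=1, m=1>|up>; J^2 acting on it gives this column.
top_column = [row[0] for row in j_squared]
print(residual_commutator, residual_casimir, top_column[0].real)
```

Both residuals should vanish up to rounding, and the first basis vector $|1,1\rangle|\uparrow\rangle$ should come out as an eigenvector of $\hat{J}^2$ with eigenvalue $j(j+1)$ for $j = 3/2$.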

However, it would be wrong for you to think that Eqs. 9.73 and 9.74 are the end of the story. While these equations do give you some information about the eigenvalues and eigenvectors of $\hat{J}^2$ and $\hat{J}_z$, this information is quite limited and does not allow you, for instance, to generate representations of these vectors in any basis except their own or to evaluate the results of the application of various combinations of orbital and spin angular momentum operators to these states. To be able to do all this, you need to answer some rather tough questions such as (a) what is the relation between numbers $j$, $m_J$ on the one hand and numbers $l$, $s$, $m$, and $m_s$ on the other, and (b) how are vectors $|j, m_J\rangle$ connected with vectors $|l, m_l\rangle$ and $|m_s\rangle$? Finding answers to these questions requires substantial additional effort, so that Eqs. 9.73 and 9.74 are not the end but just the beginning of the journey.

And as a first step, I would note an additional property of the operators $\hat{J}^2$ and $\hat{J}_z$, which they possess by virtue of being built from sums of orbital and spin operators: they both commute with operators $\hat{L}^2$ and $\hat{S}^2$. Proof of this statement is quite straightforward and is based on Eq. 9.72 as well as on the fact that both $\hat{L}^2$ and $\hat{S}^2$ commute with all their components (well, $\hat{S}^2$ for spin $1/2$ is proportional to a unity matrix and, therefore, commutes with everything). This means that operators $\hat{J}^2$, $\hat{J}_z$, $\hat{L}^2$, and $\hat{S}^2$ have a common set of eigenvectors, so that numbers $j$ and $m_J$ do not provide a full description of these vectors. To have these vectors fully characterized, one needs to throw number $l$ into the mix, replacing $|j, m_J\rangle$ with $|j, l, m_J\rangle$ and adding equation

$$\hat{L}^2\, |j, l, m_J\rangle = \hbar^2 l(l+1)\, |j, l, m_J\rangle \tag{9.75}$$

to Eqs. 9.73 and 9.74. Strictly speaking, I would need to include here the spin number $s$ as well, but since I am going to limit this discussion to spin $1/2$ particles only, this number never changes, so that its inclusion would just superfluously increase the clumsiness of the notation.

A relation between vectors $|j, l, m_J\rangle$ and individual eigenvectors of the orbital and spin operators can be established by using the latter as a basis in the combined spin–orbital space defined in Sect. 9.5.1 as a tensor product of the orbital and spinor spaces. Specializing the generic Eq. 9.58 to the particular case when the basis in the orbital space is presented by vectors $|l, m\rangle$, I can write for an arbitrary member $|\chi\rangle$ of the tensor product space:

$$|\chi\rangle = \sum_{l',m,m_s} C^{l'}_{m,m_s}\, |l', m\rangle |m_s\rangle. \tag{9.76}$$

However, when applying this expansion to the particular case of vectors $|j, l, m_J\rangle$, I need to take into account that these vectors are eigenvectors of $\hat{L}^2$, i.e., that they must obey Eq. 9.75:

$$\hat{L}^2 \sum_{l',m,m_s} C^{l'}_{m,m_s}\, |l', m\rangle |m_s\rangle = \sum_{l',m,m_s} C^{l'}_{m,m_s}\, \hat{L}^2 |l', m\rangle |m_s\rangle = \hbar^2 \sum_{l',m,m_s} C^{l'}_{m,m_s}\, l'(l'+1)\, |l', m\rangle |m_s\rangle = \hbar^2 l(l+1) \sum_{l',m,m_s} C^{l'}_{m,m_s}\, |l', m\rangle |m_s\rangle.$$

Because of the orthogonality of the basis vectors $|l, m\rangle |m_s\rangle$, the only way to satisfy the last equality is to make sure that $l' = l$ is the only term in the sum. This is achieved by setting $C^{l'}_{m,m_s} = C^{l}_{m,m_s}\delta_{l,l'}$ and thereby vanquishing the summation over $l'$. In a less formal way, you can argue that for the vector defined by Eq. 9.76 to be an eigenvector of $\hat{L}^2$, it cannot be a combination of vectors with different values of $l$. Thus, I can conclude that a representation of $|j, l, m_J\rangle$ in the basis of $|l, m\rangle |m_s\rangle$ must have the following form:

$$|j, l, m_J\rangle = \sum_{m,m_s} C^{l,j}_{m,m_s,m_J}\, |l, m\rangle |m_s\rangle, \tag{9.77}$$

where I also added upper index $j$ and lower index $m_J$ to the expansion coefficients to make it explicit that the expansion is for eigenvectors of operators $\hat{J}^2$ and $\hat{J}_z$ characterized by quantum numbers $j$ and $m_J$.

The task now is to find coefficients $C^{l,j}_{m,m_s,m_J}$, which are a particular case of so-called Clebsch–Gordan coefficients.³ To this end I will first apply operator $\hat{J}_z$ to the left-hand side of Eq. 9.77 and operator $\hat{L}_z + \hat{S}_z$ to its right-hand side. Using Eq. 9.74 on the left-hand side and similar properties of orbital and spin angular momentum operators on the right-hand side, I obtain (after canceling the common factor $\hbar$)

$$m_J\, |j, l, m_J\rangle = \sum_{m,m_s} C^{l,j}_{m,m_s,m_J} (m + m_s)\, |l, m\rangle |m_s\rangle \;\Rightarrow$$

$$m_J \sum_{m,m_s} C^{l,j}_{m,m_s,m_J}\, |l, m\rangle |m_s\rangle = \sum_{m,m_s} C^{l,j}_{m,m_s,m_J} (m + m_s)\, |l, m\rangle |m_s\rangle \;\Rightarrow$$

$$\sum_{m,m_s} C^{l,j}_{m,m_s,m_J} (m_J - m - m_s)\, |l, m\rangle |m_s\rangle = 0.$$

³ Clebsch–Gordan coefficients allow one to present eigenvectors of an operator $\hat{\mathbf{J}}_1 + \hat{\mathbf{J}}_2$ in terms of eigenvectors of generic angular momentum operators $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$.

For the equation in the last line to be true, one of two things should happen: either $m_J = m + m_s$ or $C^{l,j}_{m,m_s,m_J} = 0$. This means that the Clebsch–Gordan coefficients vanish unless $m = m_J - m_s$, so that they can be presented as

$$C^{l,j}_{m,m_s,m_J} = C^{l,j}_{m_s,m_J}\, \delta_{m,m_J-m_s}.$$

Substituting this result into Eq. 9.77, I can eliminate the summation over $m$ and obtain a simplified form of this expansion:

$$|j, l, m_J\rangle = \sum_{m_s} C^{l,j}_{m_s,m_J}\, |l, m_J - m_s\rangle |m_s\rangle = C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle, \tag{9.78}$$

where the last expression explicitly accounts for the fact that the spin number $m_s$ only takes two values, $1/2$ and $-1/2$. Equation 9.78 contains all the information about Clebsch–Gordan coefficients that I could extract from operator $\hat{J}_z$ (which is not that much), but hopefully I can learn more from operator $\hat{J}^2$.

The idea is the same: apply $\hat{J}^2$ to the left-hand side of Eq. 9.78, its reincarnation in the form $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}$ to this equation's right-hand side, and find the conditions under which the two sides of the equation agree. The first step is just a recapitulation of Eq. 9.73:

$$\hat{J}^2\, |j, l, m_J\rangle = \hbar^2 j(j+1)\, |j, l, m_J\rangle = \hbar^2 j(j+1) \left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right], \tag{9.79}$$

but the second one results in rather long expressions, which would not even fit on a single page. Therefore, I will deal with different terms in $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}$ separately. First I will do $\hat{L}^2 + \hat{S}^2$, which is the easiest to handle:

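$$\begin{aligned}
\left(\hat{L}^2 + \hat{S}^2\right) &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
\hbar^2 l(l+1) &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] + \\
\frac{3}{4}\hbar^2 &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
\hbar^2 \left[ l(l+1) + \frac{3}{4} \right] &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right].
\end{aligned} \tag{9.80}$$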

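To evaluate the remaining $\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}$ term, I first give it a makeover using ladder operators $\hat{L}_\pm$ and $\hat{S}_\pm$:

$$\begin{aligned}
\hat{\mathbf{L}} \cdot \hat{\mathbf{S}} &= \hat{L}_x\hat{S}_x + \hat{L}_y\hat{S}_y + \hat{L}_z\hat{S}_z \\
&= \hat{L}_z\hat{S}_z + \frac{1}{2}\left(\hat{L}_+ + \hat{L}_-\right) \frac{1}{2}\left(\hat{S}_+ + \hat{S}_-\right) + \frac{1}{2i}\left(\hat{L}_+ - \hat{L}_-\right) \frac{1}{2i}\left(\hat{S}_+ - \hat{S}_-\right) \\
&= \hat{L}_z\hat{S}_z + \frac{1}{2}\left(\hat{L}_-\hat{S}_+ + \hat{L}_+\hat{S}_-\right),
\end{aligned} \tag{9.81}$$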

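where I used Eqs. 3.59 and 3.60 for orbital and Eqs. 5.109 and 5.108 for spin operators. Using the fact that $\left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle$ and $\left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle$ are eigenvectors of $\hat{L}_z$ and $\hat{S}_z$ with eigenvalues $\hbar(m_J - 1/2)$, $\hbar/2$ and $\hbar(m_J + 1/2)$, $-\hbar/2$, respectively, I get for the first term in the last line of Eq. 9.81:

$$\begin{aligned}
\hat{L}_z\hat{S}_z &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
&\frac{\hbar^2}{2}\left(m_J - \frac{1}{2}\right) C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle - \frac{\hbar^2}{2}\left(m_J + \frac{1}{2}\right) C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle.
\end{aligned} \tag{9.82}$$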

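To compute the contribution from $\hat{L}_-\hat{S}_+$ and $\hat{L}_+\hat{S}_-$, you need to recall that $\hat{S}_+|\uparrow\rangle = 0$, $\hat{S}_-|\downarrow\rangle = 0$, $\hat{S}_+|\downarrow\rangle = \hbar|\uparrow\rangle$, and $\hat{S}_-|\uparrow\rangle = \hbar|\downarrow\rangle$. (These formulas originally appeared in Sect. 5.2.3, Eqs. 5.104 and 5.106, but I am repeating them here for your convenience.) You will also need to go back to Eqs. 3.75 and 3.76 to figure out the part related to operators $\hat{L}_\pm$. Having refreshed your memory of the ladder operators this way, you can get

$$\begin{aligned}
\hat{L}_-\hat{S}_+ &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
&\hbar^2 \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{-1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle
\end{aligned} \tag{9.83}$$

and

$$\begin{aligned}
\hat{L}_+\hat{S}_- &\left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
&\hbar^2 \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle.
\end{aligned} \tag{9.84}$$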

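Finally, you just need to bring together all Eqs. 9.80–9.84 and apply some simple algebra (just group together the like terms) to cross the goal line:

$$\begin{aligned}
&\left(\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}\right) \left[ C^{l,j}_{1/2,m_J} \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + C^{l,j}_{-1/2,m_J} \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] = \\
&\hbar^2 \left[ C^{l,j}_{1/2,m_J} \left( l(l+1) + m_J + \frac{1}{4} \right) + \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{-1/2,m_J} \right] \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle \; + \\
&\hbar^2 \left[ C^{l,j}_{-1/2,m_J} \left( l(l+1) - m_J + \frac{1}{4} \right) + \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{1/2,m_J} \right] \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle.
\end{aligned}$$

Comparing this against Eq. 9.79 and equating coefficients in front of each of the vectors, you will end up with the following system of equations for coefficients $C^{l,j}_{1/2,m_J}$ and $C^{l,j}_{-1/2,m_J}$:

$$\left[ l(l+1) - j(j+1) + m_J + \frac{1}{4} \right] C^{l,j}_{1/2,m_J} + \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{-1/2,m_J} = 0 \tag{9.85}$$

$$\sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)}\; C^{l,j}_{1/2,m_J} + \left[ l(l+1) - j(j+1) + \frac{1}{4} - m_J \right] C^{l,j}_{-1/2,m_J} = 0. \tag{9.86}$$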

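And once again you are looking for non-zero solutions of a homogeneous system of linear equations, and once again you need to find zeroes of the determinant formed by its coefficients:

$$\begin{vmatrix} l(l+1) - j(j+1) + m_J + \frac{1}{4} & \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)} \\[1ex] \sqrt{l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right)} & l(l+1) - j(j+1) + \frac{1}{4} - m_J \end{vmatrix} = 0.$$

Evaluation of the determinant yields

$$\begin{aligned}
&\left[ l(l+1) - j(j+1) + \frac{1}{4} + m_J \right] \left[ l(l+1) - j(j+1) + \frac{1}{4} - m_J \right] - \left[ l(l+1) - \left(m_J + \tfrac{1}{2}\right)\left(m_J - \tfrac{1}{2}\right) \right] = \\
&\left[ l(l+1) - j(j+1) + \frac{1}{4} \right]^2 - l(l+1) - \frac{1}{4} = \left[ \left(l + \frac{1}{2}\right)^2 - j(j+1) \right]^2 - \left(l + \frac{1}{2}\right)^2,
\end{aligned}$$

where I used the easily verified identity

$$l(l+1) + \frac{1}{4} = \left(l + \frac{1}{2}\right)^2. \tag{9.87}$$

Now it is quite easy to find that equation

$$\left[ \left(l + \frac{1}{2}\right)^2 - j(j+1) \right]^2 - \left(l + \frac{1}{2}\right)^2 = 0$$

is satisfied for

$$j(j+1) = \left(l + \frac{1}{2}\right)\left(l + \frac{3}{2}\right)$$

or

$$j(j+1) = \left(l + \frac{1}{2}\right)\left(l - \frac{1}{2}\right).$$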

The only physically meaningful solutions of these equations are

$$j_1 = l + \frac{1}{2} \tag{9.88}$$

and

$$j_2 = l - \frac{1}{2}. \tag{9.89}$$

(Two other solutions, $-l - 3/2$ and $-l - 1/2$, are negative and must be ignored.) The obtained result means that for any value of the orbital quantum number $l$, operator $\hat{J}^2$ has two possible eigenvalues $\hbar^2 j_1(j_1+1)$ and $\hbar^2 j_2(j_2+1)$ with $j_1$ and $j_2$ defined above. For each value of $j$, there are $2j+1$ values of $m_J$, $m_J = -j, -j+1, \cdots, j-1, j$, so that the total number of states $|j, l, m_J\rangle$ (for a given $l$) is $\left[2(l + 1/2) + 1\right] + \left[2(l - 1/2) + 1\right] = 2(2l+1)$, which is exactly the same as the number of states $|l, m\rangle |m_s\rangle$. One important conclusion from this arithmetic is that the orthogonal and linearly independent states $|j, l, m_J\rangle$ and the other orthogonal and independent states $|l, m\rangle |m_s\rangle$ represent two alternative bases in the same vector space: vectors of the former basis are defined by the states in which a measurement of the total angular momentum and its component would yield determinate results, and vectors of the latter basis correspond to the states in which the orbital and spin momenta separately would have definite values.
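Both the roots of the determinant and the state counting are easy to confirm with exact rational arithmetic. The following sketch is only an illustration: the function names are my own, $l$ is passed as an integer, and the guard `j > 0` reflects the fact that for $l = 0$ only $j = 1/2$ survives.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def determinant(l, j, m_j):
    """Determinant of the homogeneous system (9.85)-(9.86), computed exactly."""
    diag = l * (l + 1) - j * (j + 1) + Fraction(1, 4)
    off_diag_squared = l * (l + 1) - (m_j + HALF) * (m_j - HALF)
    return (diag + m_j) * (diag - m_j) - off_diag_squared

def count_coupled_states(l):
    """Number of states |j, l, m_J> for a given l, with 2j + 1 values of m_J per j."""
    allowed_j = [j for j in (l + HALF, l - HALF) if j > 0]
    return sum(int(2 * j + 1) for j in allowed_j)

for l in range(1, 4):
    for j in (l + HALF, l - HALF):
        m_values = [-j + k for k in range(int(2 * j) + 1)]
        zeros = [determinant(l, j, m_j) for m_j in m_values]
        print(l, j, zeros)
    print(l, count_coupled_states(l), 2 * (2 * l + 1))
```

The determinant vanishes for every allowed $m_J$, which reflects the fact that $m_J$ drops out of it altogether after the cancellation shown above.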

Now I can go back to Eqs. 9.85 and 9.86 and find the Clebsch–Gordan coefficients that establish a connection between vectors $|j, l, m_J\rangle$ and vectors $|l, m\rangle |m_s\rangle$, signaling that the end of this journey is close. Substituting the found values for $j_1$ and $j_2$ into Eqs. 9.85 and 9.86, I find the two sets of the coefficients:

$$C^{l,j_1}_{-1/2,m_J} = \frac{l + \frac{1}{2} - m_J}{\sqrt{l(l+1) - m_J^2 + \frac{1}{4}}}\; C^{l,j_1}_{1/2,m_J} = \sqrt{\frac{l + \frac{1}{2} - m_J}{l + \frac{1}{2} + m_J}}\; C^{l,j_1}_{1/2,m_J} \tag{9.90}$$

$$C^{l,j_2}_{1/2,m_J} = -\frac{l + \frac{1}{2} - m_J}{\sqrt{l(l+1) - m_J^2 + \frac{1}{4}}}\; C^{l,j_2}_{-1/2,m_J} = -\sqrt{\frac{l + \frac{1}{2} - m_J}{l + \frac{1}{2} + m_J}}\; C^{l,j_2}_{-1/2,m_J} \tag{9.91}$$

where I again used Eq. 9.87. As usual, Eqs. 9.85 and 9.86 yield only the ratio of the coefficients, and in order to find the coefficients themselves, the normalization requirement, complemented by the convention that the Clebsch–Gordan coefficients remain real, needs to be invoked. Substituting Eqs. 9.90 and 9.91 into the normalization condition

$$\left| C^{l,j}_{-1/2,m_J} \right|^2 + \left| C^{l,j}_{1/2,m_J} \right|^2 = 1,$$

I find after some trivial algebra

$$\begin{aligned}
C^{l,j_1}_{1/2,m_J} &= \sqrt{\frac{l + \frac{1}{2} + m_J}{2l+1}}; & C^{l,j_1}_{-1/2,m_J} &= \sqrt{\frac{l + \frac{1}{2} - m_J}{2l+1}} \\
C^{l,j_2}_{1/2,m_J} &= \sqrt{\frac{l + \frac{1}{2} - m_J}{2l+1}}; & C^{l,j_2}_{-1/2,m_J} &= -\sqrt{\frac{l + \frac{1}{2} + m_J}{2l+1}}.
\end{aligned} \tag{9.92}$$

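Now you just plug Eq. 9.92 into Eq. 9.78 to derive the final expressions for the two eigenvectors of operator $\hat{J}^2$ characterized by quantum numbers $j_1$ and $j_2$ in terms of linear combinations of the orbital and spin angular momentum eigenvectors:

$$|l + 1/2, l, m_J\rangle = \frac{1}{\sqrt{2l+1}} \left[ \sqrt{l + m_J + 1/2}\, \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle + \sqrt{l - m_J + 1/2}\, \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right] \tag{9.93}$$

$$|l - 1/2, l, m_J\rangle = \frac{1}{\sqrt{2l+1}} \left[ \sqrt{l - m_J + 1/2}\, \left|l, m_J - \tfrac{1}{2}\right\rangle |\uparrow\rangle - \sqrt{l + m_J + 1/2}\, \left|l, m_J + \tfrac{1}{2}\right\rangle |\downarrow\rangle \right]. \tag{9.94}$$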

It is quite easy to verify that vectors $|l + 1/2, l, m_J\rangle$ and $|l - 1/2, l, m_J\rangle$ are normalized and orthogonal, as they should be. One can interpret this result by saying that if an electron is prepared in a state with determinate values of the square of the total angular momentum, $\hbar^2 j(j+1)$, one of its components, $\hbar m_J$, and the square of the orbital momentum, $\hbar^2 l(l+1)$, the values of the corresponding components of its orbital momentum $\hbar m$ and spin $\hbar m_s$ remain uncertain. An attempt to measure them will produce the combination $m = m_J - 1/2$, $m_s = 1/2$ with probabilities

$$p_{m_J - 1/2,\, 1/2} = \begin{cases} \dfrac{l + m_J + 1/2}{2l+1}, & j = l + 1/2 \\[2ex] \dfrac{l - m_J + 1/2}{2l+1}, & j = l - 1/2 \end{cases} \tag{9.95}$$

or the combination $m = m_J + 1/2$, $m_s = -1/2$ with probabilities

$$p_{m_J + 1/2,\, -1/2} = \begin{cases} \dfrac{l - m_J + 1/2}{2l+1}, & j = l + 1/2 \\[2ex] \dfrac{l + m_J + 1/2}{2l+1}, & j = l - 1/2. \end{cases} \tag{9.96}$$
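The claims about normalization and orthogonality can be spot-checked numerically. The sketch below is an illustration under my own naming (the functions `up_probability` and `clebsch_gordan` are hypothetical helpers, not part of any library); it encodes Eqs. 9.92 and 9.95 directly and expects $l$ as an integer and $j$, $m_J$ as exact fractions.

```python
from fractions import Fraction
from math import sqrt, isclose

HALF = Fraction(1, 2)

def up_probability(l, j, m_j):
    """Probability of finding m = m_J - 1/2, m_s = +1/2 (Eq. 9.95)."""
    if j == l + HALF:
        return (l + m_j + HALF) / (2 * l + 1)
    if j == l - HALF:
        return (l - m_j + HALF) / (2 * l + 1)
    raise ValueError("j must equal l + 1/2 or l - 1/2")

def clebsch_gordan(l, j, m_j):
    """Real coefficients (C_up, C_down) of Eq. 9.92, returned as floats."""
    p_up = up_probability(l, j, m_j)
    c_up = sqrt(p_up)
    c_down = sqrt(1 - p_up)          # Eq. 9.96 is the complement of Eq. 9.95
    return (c_up, c_down) if j == l + HALF else (c_up, -c_down)

l, m_j = 2, HALF
upper = clebsch_gordan(l, l + HALF, m_j)     # j = 5/2
lower = clebsch_gordan(l, l - HALF, m_j)     # j = 3/2
norm_upper = upper[0] ** 2 + upper[1] ** 2
overlap = upper[0] * lower[0] + upper[1] * lower[1]
print(up_probability(l, l + HALF, m_j), up_probability(l, l - HALF, m_j))
print(norm_upper, overlap)
```

Keeping the probabilities as exact fractions and taking square roots only at the last step avoids rounding noise in the comparison of Eqs. 9.95 and 9.96.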

To help you feel better about these results, let me illustrate the application of Eqs. 9.95 and 9.96 by a few examples.


Example 27 (Measuring Spin and Orbital Angular Momenta in a State with a Definite Value of the Total Angular Momentum) Assume that an electron is in a state with a given orbital momentum $l$, total angular momentum $j = l - 1/2$, and its z-component $m_J = l - 3/2$, and that you have a magic instrument allowing you to measure the z-components of the electron's orbital momentum and its spin. What are the possible outcomes of such a measurement and their probabilities?

Solution

Value $m_J = l - 3/2$ can be obtained in two different ways: when $m_s = 1/2$ and $m = l - 2$, or when $m_s = -1/2$ and $m = l - 1$. The probability of the first outcome is (second line in Eq. 9.95)

$$p_{l-2,\,1/2} = \frac{l - (l - 3/2) + 1/2}{2l+1} = \frac{2}{2l+1},$$

and the probability of the second outcome (second line in Eq. 9.96) is

$$p_{l-1,\,-1/2} = \frac{l + (l - 3/2) + 1/2}{2l+1} = \frac{2l-1}{2l+1}.$$

Obviously the sum of the two probabilities is equal to one, and for large values of $l$, the second outcome is significantly more probable.

Example 28 (More on Measurement of Spin and Orbital Momenta) Let me modify the previous example by assuming that the value of the total angular momentum is not known, but it is known that the electron can be in either state of the total angular momentum with equal probability. How will the answer to the previous example change in this case?

Solution

Now you have to take into account that both possible outcomes discussed in the previous example can come either from the state with $j = l + 1/2$ or the state with $j = l - 1/2$. Accordingly, the total probabilities of the outcomes become

$$p_{l-2,\,1/2} = \frac{1}{2}\, \frac{l - (l - 3/2) + 1/2}{2l+1} + \frac{1}{2}\, \frac{l + (l - 3/2) + 1/2}{2l+1} = \frac{l + 1/2}{2l+1} = \frac{1}{2}$$

$$p_{l-1,\,-1/2} = \frac{1}{2}\, \frac{l + (l - 3/2) + 1/2}{2l+1} + \frac{1}{2}\, \frac{l - (l - 3/2) + 1/2}{2l+1} = \frac{1}{2}.$$
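The numbers in Examples 27 and 28 can be reproduced for any particular $l$ with a few lines of exact arithmetic. This is a sketch under stated assumptions: the helper names are mine, the value $l = 3$ is an arbitrary illustration, and the "equal probability" of Example 28 is treated as a classical 50/50 mixture of the two $j$ states, which is how the solution above adds the probabilities.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def outcome_probabilities(l, j, m_j):
    """Return {(m, m_s): probability} from Eqs. 9.95 and 9.96.

    Assumes j is either l + 1/2 or l - 1/2.
    """
    sign = 1 if j == l + HALF else -1        # j = l - 1/2 swaps the two weights
    p_up = (l + sign * m_j + HALF) / (2 * l + 1)
    return {(m_j - HALF, HALF): p_up, (m_j + HALF, -HALF): 1 - p_up}

def mixture(l, m_j, weights):
    """Average the outcome probabilities over {j: classical weight}."""
    result = {}
    for j, w in weights.items():
        for outcome, p in outcome_probabilities(l, j, m_j).items():
            result[outcome] = result.get(outcome, 0) + w * p
    return result

l = 3                                         # illustrative value only
m_j = l - Fraction(3, 2)
example_27 = outcome_probabilities(l, l - HALF, m_j)
example_28 = mixture(l, m_j, {l + HALF: HALF, l - HALF: HALF})
print(example_27)
print(example_28)
```

For any $l \ge 1$ the first dictionary should reproduce $2/(2l+1)$ and $(2l-1)/(2l+1)$, and the second should give $1/2$ for both outcomes.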

Even though, generally speaking, $m_J$ on the one hand and $m$ and $m_s$ on the other cannot be known with certainty in the same state, there exist two states in which all three of these quantum numbers have definite values. These are the states with the largest, $m_J = l + 1/2$, and smallest, $m_J = -l - 1/2$, values of $m_J$, for which one of the Clebsch–Gordan coefficients vanishes, while the other one turns to unity, reducing Eq. 9.93 to

$$|l + 1/2, l, l + 1/2\rangle = |l, l\rangle |\uparrow\rangle; \qquad |l + 1/2, l, -l - 1/2\rangle = |l, -l\rangle |\downarrow\rangle.$$

You can easily understand this fact by noting that $m_J = l + 1/2$ or $m_J = -l - 1/2$ can be obtained only by a single combination of $m$ and $m_s$: $m_J = l + 1/2$ corresponds to the choice $m = l$ and $m_s = 1/2$, while $m_J = -l - 1/2$ can only be generated by $m = -l$, $m_s = -1/2$.

Equations 9.88 and 9.89 together with Eqs. 9.93 and 9.94 provide answers to all the questions posed in the beginning of this subsection: you now know the relation between total, orbital, and spin angular momentum quantum numbers as well as between the corresponding eigenvectors. In particular, Eqs. 9.93 and 9.94 allow generating any representation for $|j, l, m_J\rangle$, using corresponding representations for vectors $|l, m\rangle$ and $|m_s\rangle$, as well as defining the action of any combination of orbital and spin operators on these vectors. To illustrate this point, I will write down the coordinate–spinor representation of $|l + 1/2, l, m_J\rangle$ using Eq. 9.93 and the corresponding representations for $|l, m\rangle$ and $|m_s\rangle$:

$$|l + 1/2, l, m_J\rangle = \frac{1}{\sqrt{2l+1}} \begin{bmatrix} \sqrt{l + m_J + 1/2}\; Y_l^{m_J - 1/2}(\theta,\varphi) \\[1ex] \sqrt{l - m_J + 1/2}\; Y_l^{m_J + 1/2}(\theta,\varphi) \end{bmatrix}.$$

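I can now use this to compute, e.g., $\hat{L}_+\hat{S}_-\, |l + 1/2, l, m_J\rangle$. Taking the matrix representation for $\hat{S}_-$ from Eq. 5.107 and recalling that any orbital operator in the spinor representation is multiplied by a unity matrix, I can rewrite this expression as

$$\begin{aligned}
\hat{L}_+\hat{S}_-\, |l + 1/2, l, m_J\rangle &= \frac{\hbar}{\sqrt{2l+1}} \begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \sqrt{l + m_J + 1/2}\; Y_l^{m_J - 1/2}(\theta,\varphi) \\ \sqrt{l - m_J + 1/2}\; Y_l^{m_J + 1/2}(\theta,\varphi) \end{bmatrix} \\
&= \frac{\hbar \sqrt{l + m_J + 1/2}}{\sqrt{2l+1}} \begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix} \begin{bmatrix} 0 \\ Y_l^{m_J - 1/2}(\theta,\varphi) \end{bmatrix} = \frac{\hbar \sqrt{l + m_J + 1/2}}{\sqrt{2l+1}} \begin{bmatrix} 0 \\ \hat{L}_+ Y_l^{m_J - 1/2}(\theta,\varphi) \end{bmatrix} \\
&= \frac{\hbar^2 \sqrt{l + m_J + 1/2}}{\sqrt{2l+1}} \sqrt{l(l+1) - (m_J - 1/2)(m_J + 1/2)} \begin{bmatrix} 0 \\ Y_l^{m_J + 1/2}(\theta,\varphi) \end{bmatrix} \\
&= \hbar^2 \left(l + m_J + 1/2\right) \sqrt{\frac{l - m_J + 1/2}{2l+1}} \begin{bmatrix} 0 \\ Y_l^{m_J + 1/2}(\theta,\varphi) \end{bmatrix},
\end{aligned}$$

where the second factor of $\hbar$ comes from the action of $\hat{L}_+$ on the spherical harmonic (Eq. 3.75), and the last step uses Eq. 9.87 to write $l(l+1) - (m_J - 1/2)(m_J + 1/2) = (l + 1/2 - m_J)(l + 1/2 + m_J)$.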
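The last simplification step is easy to get wrong by hand, so here is a small numerical cross-check. It is only a sketch: the function names are my own, $\hbar$ is set to 1, and the two functions simply compare the factor-by-factor product (the Clebsch–Gordan coefficient times the ladder-operator factor) with the closed form in the final line.

```python
from math import sqrt, isclose

def stepwise_amplitude(l, m_j):
    """Lower spinor component built factor by factor (units of hbar**2)."""
    c_up = sqrt((l + m_j + 0.5) / (2 * l + 1))                    # Eq. 9.92, j = l + 1/2
    raise_factor = sqrt(l * (l + 1) - (m_j - 0.5) * (m_j + 0.5))  # L+ acting on Y_l^{m_J - 1/2}
    return c_up * raise_factor

def closed_form_amplitude(l, m_j):
    """Simplified coefficient from the final line of the calculation."""
    return (l + m_j + 0.5) * sqrt((l - m_j + 0.5) / (2 * l + 1))

for l in range(1, 4):
    m_j = -l - 0.5
    while m_j <= l + 0.5:
        print(l, m_j, stepwise_amplitude(l, m_j), closed_form_amplitude(l, m_j))
        m_j += 1.0
```

At $m_J = l + 1/2$ both expressions vanish, as they must, because $\hat{L}_+$ cannot raise $m$ above $l$.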

9.6 Problems

Section 9.2

Problem 112 Write down a spinor corresponding to the point on the Bloch sphere with coordinates � D �=4, ' D 3�=2.

322 9 Spin 1/2

Problem 113 The impossibility of half-integer values of the angular momentum for orbital angular momentum operators expressed in terms of coordinate and momentum operators can be demonstrated by considering the following example. Imagine that there exists a state of the orbital angular momentum with $l = 1/2$. Then in the coordinate representation, these states would be represented by two functions $f_{1/2}(\theta, \varphi)$ and $f_{-1/2}(\theta, \varphi)$ corresponding to the values of the magnetic quantum number $m = 1/2$ and $m = -1/2$, respectively. These functions must obey the following set of equations:

$$\hat{L}_+ f_{1/2}(\theta, \varphi) = 0; \qquad \hat{L}_- f_{-1/2}(\theta, \varphi) = 0;$$
$$\hat{L}_+ f_{-1/2}(\theta, \varphi) = f_{1/2}(\theta, \varphi); \qquad \hat{L}_- f_{1/2}(\theta, \varphi) = f_{-1/2}(\theta, \varphi).$$

Using the coordinate representation of the ladder operators, show that these equations are mutually inconsistent.

Problem 114 An electron is in a spin state described by the (non-normalized) spinor:

$$|\chi\rangle = \begin{bmatrix} 2i - 3 \\ 4 \end{bmatrix}.$$

1. Normalize this spinor.
2. If you measure the z-component of the spin, what are the probabilities of various outcomes?
3. What is the expectation value of the z-component of the spin in this state?
4. Answer the same questions for x- and y-components.

Problem 115

1. Consider a spin in the state

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$

You measure the component of the spin in the direction of the unit vector $\mathbf{n}$ characterized by angles $\theta, \varphi$ of the spherical coordinate system. What is the probability of obtaining the value $-\hbar/2$ as an outcome of this measurement?

2. Imagine that you conduct two measurements in quick succession: first you carry out the measurement described in the previous part of the problem, and right after that, you measure the y-component of the spin. Find the probability of getting $\hbar/2$ as an outcome of the last measurement. (Hint: Do not forget to consider all possible paths that could lead to this outcome.)

Problem 116 Consider a particle with spin 1/2 in a state in which a component of the spin in a specified direction is equal to $\hbar/2$. Choose a coordinate system with the Z-axis along this direction and some arbitrary positions for X- and Y-axes in the perpendicular plane. Now imagine that you measure a component of the spin in a direction making angle 30° with the Z-axis and lying in the XZ plane. Find the probabilities of the various outcomes of this measurement.

Section 9.3

Problem 117 Derive the expression for the expectation value of the y-component of the spin in the state specified by Eq. 9.34.

Problem 118 Consider a spin in the initial state characterized by angles $\theta = \pi/6$ and $\varphi = \pi/3$ of the Bloch sphere. At time $t = 0$, the magnetic field $\mathbf{B}$ directed along the polar axis of the spherical coordinate system is turned on and remains on for $t = \pi/(2\omega_L)$ seconds. After the field is off, an experimentalist measures the z-component of the spin. What is the probability that the measurement yields $\hbar/2$? $-\hbar/2$? Answer the same questions if it is the x-component of the spin that is being measured.

Problem 119 In the last problem to Chap. 5, you found matrices $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$ for a particle with spin 3/2. Assume that an interaction of this particle with its surroundings is described by the Hamiltonian:

$$\hat{H} = \frac{\varepsilon_0}{\hbar^2}\left(\hat{S}_x^2 - \hat{S}_y^2\right) - \frac{\varepsilon_1}{\hbar^2}\hat{S}_z^2.$$

1. Find the stationary states of this Hamiltonian.
2. Assuming that the initial state of the particle is given by a generic spinor of the form

$$|\chi_0\rangle = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},$$

find the spin state of the particle at time $t$.
3. Calculate the time-dependent expectation values of all three components of the spin operator.

Problem 120 Consider a spin 1/2 particle in a time-dependent magnetic field, which rotates with angular velocity $\Omega$ in the X–Y plane:

$$\mathbf{B} = \mathbf{i} B_0 \cos\Omega t + \mathbf{j} B_0 \sin\Omega t,$$

where $\mathbf{i}$ and $\mathbf{j}$ are unit vectors in the directions of the X and Y coordinate axes, respectively. Derive the Heisenberg equations for the spin operators and solve them. Note that since the Hamiltonian of this system is time-dependent, you cannot claim the same form for the Hamiltonian in the Schrödinger and Heisenberg pictures based upon the notion that the time-evolution operator $\hat{U}$ commutes with the Hamiltonian (it does not, because it does not have the form of $\exp(-i\hat{H}t/\hbar)$, which is only valid for time-independent Hamiltonians). Nevertheless, since the time-dependent factor in the Hamiltonian does not contain operators, you can still show that the Heisenberg form of the Hamiltonian, which in the Schrödinger picture has the form

$$\hat{H} = \frac{2\mu_B}{\hbar}\hat{\mathbf{S}} \cdot \mathbf{B},$$

has exactly the same form in the Heisenberg picture if the Schrödinger spin operator is replaced with its time-dependent Heisenberg operator.

1. Convince yourself that this is, indeed, the case.
2. Derive the Heisenberg equations for all three components of the spin operators.
3. Solve these equations and find the time dependence of the spin operators. (Hint: You might want to introduce new time-dependent operators defined as

$$\hat{P} = \hat{S}_x \cos\Omega t + \hat{S}_y \sin\Omega t, \qquad \hat{Q} = \hat{S}_y \cos\Omega t - \hat{S}_x \sin\Omega t$$

and derive equations for them.)

Section 9.4

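Problem 121 Normalize the following vector belonging to the tensor product of two spaces:

$$|\psi\rangle = 2i\,\big|e_1^{(1)}\big\rangle \left( \big|e_1^{(2)}\big\rangle - 3i\,\big|e_2^{(2)}\big\rangle \right) + \left( 2\,\big|e_1^{(1)}\big\rangle - 3\,\big|e_2^{(1)}\big\rangle \right) \big|e_2^{(2)}\big\rangle,$$

assuming that vectors $\big|e_{1,2}^{(1)}\big\rangle$ and $\big|e_{1,2}^{(2)}\big\rangle$ are normalized and mutually orthogonal.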

Problem 122 Compute commutators $\left[\hat{S}_i^{(tp)}, \hat{S}_j^{(tp)}\right]$ for all $i \neq j$ and $\left[\hat{S}_i^{(tp)}, \left(\hat{\mathbf{S}}^{(tp)}\right)^2\right]$, where $i, j$ take values $x$, $y$, and $z$.

Problem 123 Assuming that vectors $\big|e_{1,2}^{(1)}\big\rangle$ and $\big|e_{1,2}^{(2)}\big\rangle$ in Problem 121 correspond to spin-up and spin-down states of two particles as defined by operators $\hat{S}_z^{(1,2)}$ correspondingly, compute

$$\langle\psi|\, \hat{\mathbf{S}}^{(1)} \cdot \hat{\mathbf{S}}^{(2)} \,|\psi\rangle,$$

where vector $|\psi\rangle$ is also defined in Problem 121.


Problem 124 Derive Eqs. 9.46 through 9.49.

Problem 125 Consider a system of two interacting spins described by the Hamiltonian:

$$\hat{H} = \frac{2\mu_B}{\hbar}\hat{S}^{(1)} B + \frac{2\mu_B}{\hbar}\hat{S}^{(2)} B + J\,\hat{\mathbf{S}}^{(1)} \cdot \hat{\mathbf{S}}^{(2)}.$$

Find the eigenvalues and eigenvectors of this Hamiltonian. Do it in two different ways: first, use eigenvectors of the individual $\hat{S}_z^{(1,2)}$ operators as a basis, and second, use eigenvectors of the operators of the total spin. Find the ground state of the system for different relations between the magnetic field and parameter $J$. Consider cases $J > 0$ and $J < 0$.

For Sect. 9.5.1

Problem 126 Using the approach presented in Sect. 9.4, consider addition of the operators of the orbital angular momentum and spin, limiting your consideration to the orbital states with $l = 1$.

1. Construct the matrix of the operator $\hat{J}^2$, where $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$, in the basis of eigenvectors of operators $\hat{L}^2$, $\hat{L}_z$, and $\hat{S}_z$, taking into account only those eigenvectors which belong to the orbital quantum number $l = 1$. (Hint: Your basis will consist of 6 vectors, so that you are looking for a $6 \times 6$ matrix.)
2. Diagonalize the matrix and confirm that eigenvectors of $\hat{J}^2$ are characterized by quantum numbers $j = 1/2$ and $j = 3/2$.
3. Find the eigenvectors of $\hat{J}^2$ in this basis.

Problem 127

1. Write down an expression for a spinor describing the equal superposition of states, in which an electron in the ground state of an infinite one-dimensional potential is also in a spin-up state, while an electron in the first excited state of this potential is also in the spin-down state. The potential confines the electron's motion in the x direction, while spin-up and spin-down states correspond to the z-component of the spin.
2. Imagine that you have measured a component of the spin in the x direction and obtained the value $\hbar/2$. Find the probability distribution of the electron's coordinate right after this measurement.

Problem 128 A one-dimensional harmonic oscillator is placed in a state

$$|\alpha\rangle = \frac{1}{\sqrt{2}}\left[\,|0\rangle|{\uparrow}\rangle + |1\rangle|{\downarrow}\rangle\,\right],$$

where spin-up and spin-down states are defined with respect to the z-component of the spin operator and kets $|0\rangle$ and $|1\rangle$ correspond to the ground state and the first excited state of a harmonic oscillator. At time $t = 0$ an experimentalist turns on a uniform magnetic field in the z direction. Find the state of the system at a later time $t$, and compute the expectation values of the oscillator's coordinate and momentum. (Hint: You can use Eqs. 9.68 and 9.69 with the orbital part of the Hamiltonian taken to be that of a harmonic oscillator.)

For Sect. 9.5.2

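Problem 129 Compute the expectation value of all components of the operator

$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$$

as well as of operator $\hat{J}^2$ in the state

$$|\chi\rangle = \frac{1}{\sqrt{14}}\left( Y_l^{l-2}(\theta, \varphi)\left|\tfrac{1}{2}\right\rangle - 2\,Y_l^{l}(\theta, \varphi)\left|-\tfrac{1}{2}\right\rangle + 3i\,Y_l^{2}(\theta, \varphi)\left|\tfrac{1}{2}\right\rangle \right).$$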

Problem 130 Derive Eq. 9.92.

Problem 131 Consider an electron in a state with $l = 2$, $j = 3/2$, and $m_J = 0$. If one measures the z-components of the electron orbital momentum and spin, what are the possible values and their probabilities?

Problem 132 Let me reverse the previous problem: assume that the electron is in the state with $l = 2$, $m = 1$, and $m_s = -1/2$. What are the possible values of $j$ and their probabilities?

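Problem 133 Consider an electron in the following state (in the coordinate representation):

$$|\alpha\rangle = \frac{2}{\sqrt{10}}\,Y_1^{1}(\theta, \varphi)\left|\tfrac{1}{2}\right\rangle + \frac{1}{\sqrt{10}}\,Y_2^{0}(\theta, \varphi)\left|-\tfrac{1}{2}\right\rangle + \frac{1}{\sqrt{10}}\,Y_1^{-1}(\theta, \varphi)\left|\tfrac{1}{2}\right\rangle + \frac{2}{\sqrt{10}}\,Y_2^{1}(\theta, \varphi)\left|-\tfrac{1}{2}\right\rangle.$$

1. If one measures $\hat{J}^2$ and $\hat{J}_z$, what values can one expect to observe and what are their probabilities?
2. Present this vector as a linear combination of appropriate vectors $|j, l, m_J\rangle$.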


Section 9.5.2

Problem 134 Compute commutators $\left[\hat{J}_y, \hat{J}_z\right]$ and $\left[\hat{J}_x, \hat{J}_z\right]$, and demonstrate that they have the standard form for angular momentum operators.

Problem 135 Write down the position–spinor representation of vector $|l - 1/2, l, m_J\rangle$, and compute $\hat{L}_- \hat{S}_+ |l - 1/2, l, m_J\rangle$ using this representation.

Chapter 10 Two-Level System in a Periodic External Field

I have already mentioned somewhere in the beginning of this book that while vectors representing states of realistic physical systems generally belong to an infinite-dimensional vector space, we can always (well, almost always) justify limiting our consideration to a subspace of states with a reasonably small dimension. The smallest nontrivial subspace containing states that can be assumed to be isolated from the rest of the space is two-dimensional. One relatively clean example of such a subspace is formed by two-dimensional spinors in situations when one can neglect interactions between spins of different particles as well as the spin–orbital interaction. An approximately isolated two-dimensional subspace can also be found in systems described by Hamiltonians with a discrete spectrum, if this spectrum is strongly non-equidistant, i.e., the energy intervals between adjacent energy levels $\Delta_i = E_{i+1} - E_i$ are different for different pairs of levels. Two-level models are very popular in various areas of physics because, on one hand, they are remarkably simple, while on the other hand, they capture essential properties of many real physical systems ranging from atoms to semiconductors.

The most popular (and useful) version of this model involves an interaction of a two-level system with a periodic time-dependent external “potential.” This can be an electric dipole potential describing the interaction of an atomic electron with an electric field or a magnetic “potential” describing the interaction of an electron spin with a time-dependent magnetic field. Since I am not going to go into concrete details of the physical system which this model is supposed to represent, I will introduce it by assuming that its Hamiltonian is a sum of a time-independent “unperturbed” part $\hat{H}_0$ and the time-dependent “perturbation” $\hat{V}(t)$. I will also assume that $\hat{H}_0$ has only two linearly independent and orthogonal eigenvectors, which I will designate as $|1\rangle$ and $|2\rangle$, and two corresponding eigenvalues $E_1^{(0)}$ and $E_2^{(0)}$, which may be degenerate.

It is easy to see now that $\hat{H}_0$ can be written as

$$\hat{H}_0 = E_1^{(0)} |1\rangle\langle 1| + E_2^{(0)} |2\rangle\langle 2|. \tag{10.1}$$

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_10


Indeed, taking into account the orthogonality and normalization of $|1\rangle$ and $|2\rangle$, you can find

$$\hat{H}_0 |1\rangle = E_1^{(0)} |1\rangle\langle 1|1\rangle + E_2^{(0)} |2\rangle\langle 2|1\rangle = E_1^{(0)} |1\rangle,$$

and

$$\hat{H}_0 |2\rangle = E_1^{(0)} |1\rangle\langle 1|2\rangle + E_2^{(0)} |2\rangle\langle 2|2\rangle = E_2^{(0)} |2\rangle,$$

confirming that the Hamiltonian given by Eq. 10.1 does, indeed, have the properties prescribed to it. It is obvious that in the basis of these eigenvectors, $\hat{H}_0$ is presented by a diagonal matrix with eigenvalues along the main diagonal. In the most general form, the interaction term can be written down as

$$\hat{V} = V_{11} |1\rangle\langle 1| + V_{22} |2\rangle\langle 2| + V_{12} |1\rangle\langle 2| + V_{21} |2\rangle\langle 1|.$$

The diagonal elements in this expression, $V_{ii}(t) = \langle i|\hat{V}|i\rangle$, often vanish thanks to the symmetry of the system. Indeed, if the initial Hamiltonian is symmetric with respect to inversion, its eigenvectors have definite parity: they are either odd or even. If, in addition, the interaction Hamiltonian is odd (which is quite common: for instance, the electric–dipole interaction is proportional to $\hat{\mathbf{r}} \cdot \boldsymbol{\mathcal{E}}$, where $\boldsymbol{\mathcal{E}}$ is the electric field, and the position operator changes sign upon inversion), the diagonal elements of the interaction term must vanish (details of the arguments can be found in Sect. 7.1). Also, the requirement that the operator must be Hermitian demands that $V_{21} = V_{12}^{*}$.
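The projector form of the Hamiltonian can be made concrete with a few lines of code. The following is only a minimal sketch in plain Python: the helper names (`outer`, `combine`, `act`, `is_hermitian`) and all numerical values are illustrative assumptions, not anything taken from the text. It builds the 2×2 matrices of $\hat{H}_0$ and $\hat{V}$ from outer products $|i\rangle\langle j|$ and checks the two properties discussed above.

```python
def outer(u, v):
    """Matrix of the operator |u><v| (components of v are conjugated)."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def combine(*terms):
    """Sum of (coefficient, matrix) pairs, returned as a 2x2 nested list."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]

def act(m, v):
    """Apply a 2x2 matrix to a two-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def is_hermitian(m, tol=1e-12):
    """Check m[i][j] == conj(m[j][i]) within a tolerance."""
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(2) for j in range(2))

# Basis vectors |1> and |2> in their own representation.
ket1, ket2 = [1 + 0j, 0j], [0j, 1 + 0j]

E1, E2 = 2.0, 1.0      # illustrative unperturbed energies (assumed values)
V12 = 0.3 + 0.4j       # illustrative off-diagonal element; V21 = conj(V12)

H0 = combine((E1, outer(ket1, ket1)), (E2, outer(ket2, ket2)))
V = combine((V12, outer(ket1, ket2)), (V12.conjugate(), outer(ket2, ket1)))
H = combine((1, H0), (1, V))

print(act(H0, ket1))    # components of H0|1>, i.e. E1 times |1>
print(is_hermitian(H))  # Hermiticity follows from V21 = conj(V12)
```

Choosing $V_{21}$ as anything other than the complex conjugate of $V_{12}$ makes `is_hermitian(H)` fail, which is the numerical counterpart of the Hermiticity requirement stated above.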

10.1 Two-Level System with a Time-Independent Interaction: Avoided Level Crossing

I begin by considering the properties of the two-level model with a time-independent interaction term, so that the complete Hamiltonian of the system becomes

$$\hat{H} = E_1^{(0)} |1\rangle\langle 1| + E_2^{(0)} |2\rangle\langle 2| + V_{12} |1\rangle\langle 2| + V_{12}^{*} |2\rangle\langle 1| \tag{10.2}$$

where $V_{ij}$ are in general complex constant parameters. Since this is a time-independent Hamiltonian, it makes sense to explore its eigenvectors and eigenvalues using vectors $|1\rangle$ and $|2\rangle$ as a basis. The Hamiltonian in this representation becomes a $2 \times 2$ matrix so that the eigenvector equation can be written in the matrix form

$$\begin{bmatrix} E_1^{(0)} & V_{12} \\ V_{12}^{*} & E_2^{(0)} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = E \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \tag{10.3}$$


and the corresponding equation for the eigenvalues becomes

$$\begin{vmatrix} E_1^{(0)} - E & V_{12} \\ V_{12}^{*} & E_2^{(0)} - E \end{vmatrix} = 0.$$

Evaluation of the determinant turns it into a simple quadratic equation:

$$E^2 - E\left(E_1^{(0)} + E_2^{(0)}\right) + E_1^{(0)} E_2^{(0)} - |V_{12}|^2 = 0$$

with two solutions (I provided a lot of detailed derivations in this book, but I am not going to show how to solve quadratic equations!)

$$E_1 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) + \frac{1}{2}\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \tag{10.4}$$

$$E_2 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) - \frac{1}{2}\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}. \tag{10.5}$$

Substituting the first of these solutions into

$$\left(E_1^{(0)} - E\right) a_1 + V_{12}\, a_2 = 0$$

(the first of the equations encoded in the matrix form in Eq. 10.3), I find the ratio of the coefficients representing the first eigenvector of the Hamiltonian:

$$\frac{a_1^{(1)}}{a_2^{(1)}} = \frac{-2 V_{12}}{E_1^{(0)} - E_2^{(0)} - \sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}}. \tag{10.6}$$

Repeating this calculation with the second eigenvalue, I find the ratio of the coefficients for the second eigenvector:

$$\frac{a_1^{(2)}}{a_2^{(2)}} = \frac{-2 V_{12}}{E_1^{(0)} - E_2^{(0)} + \sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}}. \tag{10.7}$$

The normalization coefficients for these eigenvectors are too cumbersome and are not too informative, so I will leave the eigenvectors non-normalized. Both of them can be written as a superposition of vectors $|1\rangle$ and $|2\rangle$ with coefficients $a_{1,2}^{(1,2)}$ defined by Eqs. 10.6 and 10.7:

$$|E_{1,2}\rangle = a_1^{(1,2)} |1\rangle + a_2^{(1,2)} |2\rangle \tag{10.8}$$


where I used eigenvalues to label the corresponding eigenvectors.

The ratios of the coefficients in this superposition determine the relative contributions of each of the original states to $|E_{1,2}\rangle$. These ratios depend on the relation between the inter-level spectral distance $\left|E_1^{(0)} - E_2^{(0)}\right|$ and the interaction matrix element $|V_{12}|$. If the former is much larger than the latter, I can expand the radical in the denominators of Eqs. 10.6 and 10.7 as

$$\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \approx E_1^{(0)} - E_2^{(0)} + \frac{2|V_{12}|^2}{E_1^{(0)} - E_2^{(0)}},$$

where it is assumed for concreteness that $E_1^{(0)} > E_2^{(0)}$. Then Eqs. 10.6 and 10.7 yield

$$\frac{a_1^{(1)}}{a_2^{(1)}} \approx \frac{V_{12}\left(E_1^{(0)} - E_2^{(0)}\right)}{|V_{12}|^2} \gg 1$$

$$\frac{a_1^{(2)}}{a_2^{(2)}} \approx -\frac{V_{12}}{E_1^{(0)} - E_2^{(0)}} \ll 1.$$

Thus, the contributions of the state represented by vector $|2\rangle$ to the eigenvector $|E_1\rangle$ and of state $|1\rangle$ to the eigenvector $|E_2\rangle$ are very small. Not surprisingly, the energy $E_1$ in this limit is close to $E_1^{(0)}$, and $E_2$ is close to $E_2^{(0)}$ (check it out, please).
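A quick numerical check can complement the algebra here. The sketch below, written in plain Python with an assumed function name (`two_level_spectrum`) and assumed illustrative parameter values, evaluates Eqs. 10.4 through 10.7 directly, verifies that the eigenvalues satisfy the quadratic equation, and looks at the limit in which the level spacing is a hundred times larger than $|V_{12}|$.

```python
import math

def two_level_spectrum(e1, e2, v12):
    """Eigenvalues (Eqs. 10.4-10.5) and ratios a1/a2 (Eqs. 10.6-10.7).

    e1, e2 are the unperturbed energies and v12 the (possibly complex)
    off-diagonal element. v12 must be non-zero, otherwise one of the
    denominators below vanishes.
    """
    root = math.sqrt((e1 - e2) ** 2 + 4 * abs(v12) ** 2)
    upper = 0.5 * (e1 + e2) + 0.5 * root
    lower = 0.5 * (e1 + e2) - 0.5 * root
    ratio_upper = -2 * v12 / (e1 - e2 - root)   # a1/a2 for |E1>
    ratio_lower = -2 * v12 / (e1 - e2 + root)   # a1/a2 for |E2>
    return upper, lower, ratio_upper, ratio_lower

def char_poly(e, e1, e2, v12):
    """Left-hand side of the quadratic equation for the eigenvalues."""
    return e * e - e * (e1 + e2) + e1 * e2 - abs(v12) ** 2

# Illustrative values (assumed): level spacing 1, |V12| = 0.01, phase 0.7.
e1, e2 = 2.0, 1.0
v12 = 0.01 * complex(math.cos(0.7), math.sin(0.7))

upper, lower, r_up, r_low = two_level_spectrum(e1, e2, v12)
print(upper - e1, lower - e2)   # small shifts of order |V12|^2 / (e1 - e2)
print(abs(r_up), abs(r_low))    # one ratio is large, the other small
```

With these numbers $|a_1^{(1)}/a_2^{(1)}|$ comes out close to $(E_1^{(0)} - E_2^{(0)})/|V_{12}| = 100$, and the shifts of the eigenvalues from the unperturbed energies are of order $|V_{12}|^2/(E_1^{(0)} - E_2^{(0)}) = 10^{-4}$, in line with the expansion above.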

These results justify the assumption lying at the foundation of the two-level model: contributions from energetically remote states can, indeed, be neglected. They also provide a quantitative condition for the validity of this approximation: $\left|E_n^{(0)} - E_m^{(0)}\right| \gg |V_{nm}|$, where $n, m$ are the labels for energy levels and the corresponding states.

It is easy to verify that if I reversed the inequality $E_1^{(0)} > E_2^{(0)}$ and assumed instead that $E_1^{(0)} < E_2^{(0)}$, the roles of vectors $|1\rangle$ and $|2\rangle$ would be interchanged: the main contribution to state $|E_1\rangle$ would come from the initial vector $|2\rangle$, and state $|E_2\rangle$ would be mostly determined by $|1\rangle$. This flipping between the initial vectors is due to a trivial but often overlooked property of the square root, $\sqrt{x^2} = |x|$, which is $x$ when $x$ is positive and $-x$ when it is negative. In one of the exercises, you are asked to verify this flipping phenomenon.

In the opposite limit $\left|E_1^{(0)} - E_2^{(0)}\right| \ll |V_{12}|$, the radical in Eqs. 10.6 and 10.7 can be approximated as

$$\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \approx 2|V_{12}|, \tag{10.9}$$

which is valid with accuracy up to terms of the order of $\left(E_1^{(0)} - E_2^{(0)}\right)^2 / |V_{12}|^2 \ll 1$. The ratios of the coefficients in this case become

$$\frac{a_1^{(1)}}{a_2^{(1)}} = \frac{-2 V_{12}}{E_1^{(0)} - E_2^{(0)} - 2|V_{12}|} \approx e^{i\delta_V}\left(1 + \frac{E_1^{(0)} - E_2^{(0)}}{2|V_{12}|}\right)$$

$$\frac{a_1^{(2)}}{a_2^{(2)}} = \frac{-2 V_{12}}{E_1^{(0)} - E_2^{(0)} + 2|V_{12}|} \approx -e^{i\delta_V}\left(1 - \frac{E_1^{(0)} - E_2^{(0)}}{2|V_{12}|}\right)$$

where I introduced the phase of the matrix element, $V_{12} = |V_{12}| \exp(i\delta_V)$, and used the approximation $(1 + x)^{-1} \approx 1 - x$. Note that the correction to the main terms ($\pm\exp[i\delta_V]$) in both expressions is linear in $\left(E_1^{(0)} - E_2^{(0)}\right) / |V_{12}|$, which justifies the approximation for the radical used in Eq. 10.9 (the neglected quadratic terms are smaller than the linear ones kept in the expressions for the coefficients). The contributions of the initial eigenvectors in this limit are almost equal to each other in magnitude while differing in their phase by $\pi$ (do I need to remind you that $-1 = \exp(i\pi)$?). Approximate expressions for the energy eigenvalues, Eqs. 10.4 and 10.5, in this limit become (again neglecting quadratic terms in $\left(E_1^{(0)} - E_2^{(0)}\right) / |V_{12}|$)

$$E_1 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) + |V_{12}| \tag{10.10}$$

$$E_2 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) - |V_{12}|. \tag{10.11}$$

What is significant about this result is that even when the difference between the initial energy levels is very small compared to the matrix element of the interaction, the difference between the actual eigenvalues is $2|V_{12}|$ and is not small at all.

Experimentalists love two-level models because they are simple (all you need to know is how to solve quadratic equations), and they are tempted to use them as often as they can in disparate fields of physics. Theoreticians, of course, hate this model with as much fervor because if all of physics could be explained by a two-level model, all theoreticians would lose their jobs. Luckily, this is not the case.

The physics described by this model becomes particularly interesting (and important) if the initial Hamiltonian $\hat{H}_0$ depends on some parameter, which can be controlled experimentally in such a way that the sign of the difference $E_1^{(0)} - E_2^{(0)}$ can be continuously altered. In this case, at a certain value of this parameter, the two initial energy levels become degenerate, and if one plots $E_1^{(0)}$ and $E_2^{(0)}$ as functions of this parameter, the corresponding curves would cross at some point. This is an example of an accidental degeneracy, which is not related to any symmetry and occurs only at particular values of a system's parameters. Still, it happens in a number of physical systems and is of great interest because it affects how the system reacts to various stimuli. If, however, one plots the actual eigenvalues as functions of the same parameter, the curves would not cross each other, as is obvious from Eqs. 10.10 and 10.11. The curves representing this dependence will now look like the ones shown in Fig. 10.1. You can see that the curves do not cross each other anymore, giving this phenomenon the name of avoided level crossing.

Fig. 10.1 An example of avoided crossing (vertical axis: energy; horizontal axis: external parameter)

This is a remarkable phenomenon, which is not easily appreciated. Let me try to help you understand what is so special about these two curves not crossing each other. Let's begin far to the left of the point of the degeneracy, where $E_1^{(0)} > E_2^{(0)}$. We ascertained that in this case the lower curve describes the energy of a state which is mostly $|2\rangle$, while the state whose energy belongs to the upper curve is mostly $|1\rangle$. At the point of avoided crossing, the eigenvectors describing the state of the system consist of both $|1\rangle$ and $|2\rangle$ in equal proportions. Now let's keep moving along the lower curve, which means that we are turning the dial and gradually changing our control parameter. After we have passed the point of avoided crossing, the relation between the initial energy levels has changed: now we have $E_1^{(0)} < E_2^{(0)}$. Now, the main contribution to the superposition represented by the points on the lower curve comes from the state $|1\rangle$,¹ and if I move the system far enough from the avoided crossing point, I will have a state mostly consisting of the state $|1\rangle$. Now think about it: we started with the state of the system being predominantly $|2\rangle$, and by continuously changing our parameter, we transformed this state into one which is now predominantly state $|1\rangle$. This is better than any Hogwarts-style transformation wizardry simply because it is neither magic nor illusion: just honest-to-earth quantum mechanics!
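The change of character along the lower curve can be traced numerically. The following sketch is an illustration under assumptions that are not part of the text: the unperturbed levels are taken to depend linearly on a hypothetical control parameter $p$, $E_1^{(0)} = -p$ and $E_2^{(0)} = p$, so that they cross at $p = 0$, and $V_{12}$ is set to an arbitrary real value. The code evaluates Eq. 10.5 and the ratio of Eq. 10.7 along the sweep.

```python
import math

def lower_branch(e1, e2, v12):
    """Lower eigenvalue (Eq. 10.5), gap E1 - E2, and weight of |1>.

    The weight is |a1|^2 / (|a1|^2 + |a2|^2) with a1/a2 from Eq. 10.7.
    """
    root = math.sqrt((e1 - e2) ** 2 + 4 * abs(v12) ** 2)
    energy = 0.5 * (e1 + e2) - 0.5 * root
    ratio = -2 * v12 / (e1 - e2 + root)          # a1/a2 on the lower curve
    weight_1 = abs(ratio) ** 2 / (1 + abs(ratio) ** 2)
    return energy, root, weight_1

V12 = 0.5                                        # assumed coupling strength
grid = [(k - 50) / 10 for k in range(101)]       # p from -5 to 5
sweep = [lower_branch(-p, p, V12) for p in grid] # assumed linear levels

min_gap = min(gap for _, gap, _ in sweep)
weights = [w for _, _, w in sweep]

print(min_gap)                       # smallest distance between the curves
print(weights[0], weights[50], weights[-1])
```

In this model the gap never closes: its minimum, reached at $p = 0$, equals $2|V_{12}|$. The weight of $|1\rangle$ on the lower curve grows from nearly zero on the far left, through one half at the avoided crossing, to nearly one on the far right.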

¹ Recall the comment I made at the end of the discussion of the limit $\left|E_1^{(0)} - E_2^{(0)}\right| \gg |V_{12}|$.


10.2 Two-Level System in a Harmonic Electric Field: Rabi Oscillations

Now let me switch gears and allow the perturbation operator $\hat{V}$ to become a function of time. More specifically, I will assume that the perturbation matrix elements $V_{12}$ and $V_{21}$ have the following form:

$$V_{21}(t) = V_{12}(t) = \mathcal{E} \cos\Omega t,$$

where $\mathcal{E}$ is real. This form of the perturbation describes, for instance, a dipole interaction between a two-level system and a harmonic electric field and appears in many realistic situations. The Hamiltonian of the system in this case reads

$$\hat{H} = E_1^{(0)} |1\rangle\langle 1| + E_2^{(0)} |2\rangle\langle 2| + \mathcal{E} \cos\Omega t \left(|1\rangle\langle 2| + |2\rangle\langle 1|\right). \tag{10.12}$$

This is the first time you are dealing with an explicitly time-dependent Hamiltonian in the Schrödinger picture, and this requires certain adjustments in the way of thinking about the problem. First of all, you have to accept the fact that you cannot present solutions in the form of $\exp(-iEt/\hbar)\,|\psi\rangle$, with $|\psi\rangle$ being an eigenvector of the Hamiltonian. The equation $\hat{H}|\psi\rangle = E|\psi\rangle$ with a time-dependent Hamiltonian and time-independent $|\psi\rangle$ does not make sense anymore. In other words, stationary states do not exist in the case of time-dependent Hamiltonians, and we need, therefore, a new way of solving the time-dependent Schrödinger equation. No one can forbid you, however, to use eigenvectors of any time-independent Hamiltonian as a basis, because a basis is a basis regardless of the properties of the Hamiltonian. The choice of the basis is determined solely by convenience, and it is especially convenient in this case to use the eigenvectors of $\hat{H}_0$ presented by vectors $|1\rangle$ and $|2\rangle$. Thus, let me present the unknown time-dependent state vector $|\psi(t)\rangle$ as a linear combination of these vectors:

$$|\psi(t)\rangle = a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle + a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle \tag{10.13}$$

with some unknown coefficients $a_{1,2}$. This expression is very reminiscent of Eq. 4.15 for a general solution with a time-independent Hamiltonian, but with two significant differences: first, the basis used in Eq. 10.13 is not formed by eigenvectors of the total Hamiltonian $\hat{H}$, which does not have eigenvectors (at least not in a regular sense of the word), and second, the expansion coefficients now are unknown functions of time, while their counterparts in Eq. 4.15 were constants. You might wonder at this point if I am allowed to separate the exponential factors characteristic of the time dependence of the stationary states. A simple answer is: “Why not?” As long as I allow for the yet undetermined time dependence of the residual coefficients, I can factor out any time-dependent function I want. It will affect the equations which these coefficients obey, but not the final result. The most meticulous of you might also ask: even if it is allowed to pull out these factors, why bother doing it? This is a more valid question, which deserves a more detailed answer. Let me begin by saying that I did not have to do it: the earth would not stop in its tracks if I did not, and we would still solve the problem. However, by doing so, I reflect a somewhat deeper understanding of two distinct sources of the time dependence of the state vectors. One is a trivial dependence given by these exponential factors, which would have existed even if the Hamiltonian did not depend on time. These exponential factors have nothing to do with the time dependence of the Hamiltonian. Factoring them out right away, I ensure that the remaining time dependence of the coefficients reflects only genuine nontrivial dynamics. As an extra bonus, I hope that by doing so, I will arrive at equations that are easier to analyze.

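Substitution of Eq. 10.13 into the left-hand side of the Schrödinger equation, $i\hbar\, d|\psi\rangle/dt$, yields

$$\begin{aligned}
i\hbar \frac{d|\psi\rangle}{dt} ={}& E_1^{(0)} a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle + i\hbar \frac{da_1(t)}{dt} \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle \\
&+ E_2^{(0)} a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle + i\hbar \frac{da_2(t)}{dt} \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle.
\end{aligned} \tag{10.14}$$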

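The right-hand side of this equation, $\hat{H}|\psi\rangle$, with $\hat{H}$ defined by Eq. 10.12 and $|\psi\rangle$ by Eq. 10.13, becomes

$$\begin{aligned}
\hat{H}|\psi\rangle ={}& E_1^{(0)} a_1(t) \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |1\rangle + E_2^{(0)} a_2(t) \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |2\rangle \\
&+ \mathcal{E} a_2(t) \cos\Omega t\, \exp\left(-\frac{iE_2^{(0)} t}{\hbar}\right) |1\rangle + \mathcal{E} a_1(t) \cos\Omega t\, \exp\left(-\frac{iE_1^{(0)} t}{\hbar}\right) |2\rangle
\end{aligned} \tag{10.15}$$

where I took into account the orthogonality of the basis states. Equating coefficients in front of vectors $|1\rangle$ and $|2\rangle$ on the left- and right-hand sides of the Schrödinger equation (Eqs. 10.14 and 10.15 correspondingly) results in differential equations for the time-dependent coefficients $a_{1,2}(t)$:

$$i\hbar \frac{da_1(t)}{dt} = \mathcal{E} a_2(t) \cos\Omega t\, \exp\left(\frac{i\left[E_1^{(0)} - E_2^{(0)}\right] t}{\hbar}\right) \tag{10.16}$$

$$i\hbar \frac{da_2(t)}{dt} = \mathcal{E} a_1(t) \cos\Omega t\, \exp\left(-\frac{i\left[E_1^{(0)} - E_2^{(0)}\right] t}{\hbar}\right). \tag{10.17}$$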


Factors $\exp\left(\pm i\left[E_1^{(0)} - E_2^{(0)}\right] t/\hbar\right)$ on the right-hand sides of these equations appeared as a result of eliminating the corresponding exponential factors $\exp\left(-iE_{1,2}^{(0)} t/\hbar\right)$ from their left-hand sides. Note that the energy eigenvalues appear in these equations only in the form of their difference, which is just another manifestation of the already mentioned fact that the absolute values of the energy levels are irrelevant. To simplify the notation, let me introduce a so-called transition frequency:

$$\omega_{12} = \frac{E_1^{(0)} - E_2^{(0)}}{\hbar} \tag{10.18}$$

where I again for concreteness assumed that $E_1^{(0)} - E_2^{(0)} > 0$. Introducing this notation and replacing $\cos\Omega t$ by the sum of the respective exponential functions, I can rewrite Eqs. 10.16 and 10.17 in the following form:

$$i\hbar \frac{da_1(t)}{dt} = \frac{1}{2}\mathcal{E} a_2(t) \left(\exp[i(\omega_{12} - \Omega)t] + \exp[i(\omega_{12} + \Omega)t]\right) \tag{10.19}$$

$$i\hbar \frac{da_2(t)}{dt} = \frac{1}{2}\mathcal{E} a_1(t) \left(\exp[-i(\omega_{12} - \Omega)t] + \exp[-i(\omega_{12} + \Omega)t]\right) \tag{10.20}$$

Equations 10.19 and 10.20 cannot be solved analytically. However, the most interesting phenomena described by these equations occur when $|\omega_{12} - \Omega| \ll \omega_{12} + \Omega$, in which case I can introduce an effective approximation capturing the most important properties of the model (obviously, something will be left out, and there might be situations when this something becomes important, but I am going to pretend that such situations do not concern me at all). In order to formulate this approximation, it is convenient to introduce a parameter $\Delta = \omega_{12} - \Omega$ called frequency detuning. In the case of small detuning, the two exponential terms in Eqs. 10.19 and 10.20 change with time on significantly different time scales. Terms containing $\omega_{12} - \Omega$ oscillate with a much larger period (much slower) as compared to the terms containing $\omega_{12} + \Omega$, which exhibit comparatively fast oscillations.

In order to understand why fast oscillations are not effective in influencing the behavior of the system, imagine a regular pendulum acted upon by a force, which changes its direction faster than the pendulum manages to react to it (it is called inertia, in case you forgot, and it takes some time for any quantity to change by any appreciable amount). What will happen to the pendulum in this case? Right before it has any chance to move in the initial direction of the force, the force will have already changed and push the pendulum in the opposite direction. This is a very frustrating situation, so the pendulum will just stay where it is. This effect in a scientific jargon is called self-averaging—the force changes so much faster than the reaction time of the pendulum that it effectively averages itself out to zero. Taking advantage of this self-averaging effect, I will drop the fast-changing terms in Eqs. 10.19 and 10.20, turning them into

$$i\hbar \frac{da_1(t)}{dt} = \frac{1}{2}\mathcal{E} a_2(t) \exp(i\Delta t) \tag{10.21}$$

$$i\hbar \frac{da_2(t)}{dt} = \frac{1}{2}\mathcal{E} a_1(t) \exp(-i\Delta t). \tag{10.22}$$
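How much is lost by dropping the fast terms can be estimated numerically. The sketch below is not part of the original derivation: it integrates both the full equations (10.19 and 10.20) and the truncated ones (10.21 and 10.22) with a hand-written fourth-order Runge–Kutta stepper, using the initial condition $a_1(0) = 0$, $a_2(0) = 1$. The function names, the step count, and the parameter values ($\omega_{12} = \Omega = 1$, $\mathcal{E}/\hbar = 0.02$ in arbitrary frequency units) are assumptions chosen for illustration.

```python
import cmath
import math

def rhs_full(t, a1, a2, omega12, omega, rabi):
    """da1/dt, da2/dt from Eqs. 10.19-10.20; rabi stands for E/hbar."""
    drive = 0.5 * rabi * (cmath.exp(1j * (omega12 - omega) * t)
                          + cmath.exp(1j * (omega12 + omega) * t))
    return -1j * drive * a2, -1j * drive.conjugate() * a1

def rhs_slow(t, a1, a2, omega12, omega, rabi):
    """da1/dt, da2/dt from Eqs. 10.21-10.22 (fast terms dropped)."""
    drive = 0.5 * rabi * cmath.exp(1j * (omega12 - omega) * t)
    return -1j * drive * a2, -1j * drive.conjugate() * a1

def integrate(rhs, t_end, steps, *params):
    """Classical RK4 from t = 0 with a1(0) = 0 and a2(0) = 1."""
    a1, a2 = 0j, 1 + 0j
    dt = t_end / steps
    for n in range(steps):
        t = n * dt
        k1 = rhs(t, a1, a2, *params)
        k2 = rhs(t + dt / 2, a1 + dt / 2 * k1[0], a2 + dt / 2 * k1[1], *params)
        k3 = rhs(t + dt / 2, a1 + dt / 2 * k2[0], a2 + dt / 2 * k2[1], *params)
        k4 = rhs(t + dt, a1 + dt * k3[0], a2 + dt * k3[1], *params)
        a1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return a1, a2

params = (1.0, 1.0, 0.02)        # omega12, Omega, E/hbar (assumed values)
t_end = math.pi / params[2]      # long enough for the slow dynamics to develop

full = integrate(rhs_full, t_end, 20000, *params)
slow = integrate(rhs_slow, t_end, 20000, *params)
p_full, p_slow = abs(full[0]) ** 2, abs(slow[0]) ** 2
print(p_full, p_slow)            # |a1|^2 with and without the fast terms
```

For a drive this weak compared with $\omega_{12} + \Omega$, the two values of $|a_1|^2$ stay close to each other, which is the numerical face of the self-averaging argument. Increasing `rabi` toward `omega` makes the discrepancy grow.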

Differentiating the first of these equations with respect to time, I get

$$i\hbar \frac{d^2 a_1(t)}{dt^2} = \frac{1}{2}\mathcal{E} \frac{da_2(t)}{dt} \exp(i\Delta t) + \frac{1}{2} i\Delta\, \mathcal{E} a_2(t) \exp(i\Delta t).$$

Now, taking $da_2/dt$ from Eq. 10.22 while expressing $a_2(t)$ in terms of $da_1/dt$ using Eq. 10.21, I get rid of the coefficient $a_2$ and derive an equation containing only $a_1$:

$$\frac{d^2 a_1(t)}{dt^2} - i\Delta \frac{da_1(t)}{dt} + \frac{1}{4\hbar^2}\mathcal{E}^2 a_1(t) = 0. \tag{10.23}$$

Did you notice how the time-dependent exponents magically disappeared from Eq. 10.23, turning it into a regular linear differential equation of the second order with constant coefficients? You might notice that this is the same equation which describes (among other things) the motion of a damped harmonic oscillator with damping represented by a term with the first time derivative. This might appear a bit troublesome, because the motion of a damped harmonic oscillator is characterized by exponential decay of the respective quantities with time, and this is not the behavior which we would like our quantum state to have. However, before going into panic mode, look at the equation a bit more carefully, and then you might notice that “the damping” coefficient (whatever appears in front of $da_1/dt$) is purely imaginary, so no real damping takes place, and you can breathe easier.

Damping or no damping, I know that equations of the type of Eq. 10.23 are solved by an exponential function, which I choose in the form of $\exp(i\omega t)$. Substitution of this function into Eq. 10.23 yields an equation for the yet unknown parameter $\omega$:

$$\omega^2 - \Delta\omega - \frac{1}{4}\Omega_R^2 = 0,$$

where I introduced a new quantity of the dimension of frequency

$$\Omega_R = \frac{\mathcal{E}}{\hbar}, \tag{10.24}$$

which plays an important role in the phenomena we are about to uncover. The quadratic equation for $\omega$ has two solutions:

$$\omega_\pm = \frac{1}{2}\Delta \pm \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} \tag{10.25}$$

(both of which are, by the way, real) so that the general solution to Eq. 10.23 takes the form

$$a_1 = A \exp(i\omega_+ t) + B \exp(i\omega_- t). \tag{10.26}$$

The expression for the second coefficient, $a_2$, is found using Eq. 10.21:

$$a_2 = \frac{2i\hbar}{\mathcal{E}} \exp(-i\Delta t) \frac{da_1(t)}{dt} = -\frac{2}{\Omega_R} \exp(-i\Delta t) \left[A\omega_+ \exp(i\omega_+ t) + B\omega_- \exp(i\omega_- t)\right].$$

Combining exponential functions in this equation, you might notice the emergence of two frequencies, $\omega_+ - \Delta$ and $\omega_- - \Delta$, which can be evaluated into

$$\omega_+ - \Delta = -\frac{1}{2}\Delta + \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_-$$

$$\omega_- - \Delta = -\frac{1}{2}\Delta - \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_+$$

allowing you to write an expression for $a_2$ as

$$a_2 = -\frac{2}{\Omega_R} \left[A\omega_+ \exp(-i\omega_- t) + B\omega_- \exp(-i\omega_+ t)\right]. \tag{10.27}$$

Amplitudes $A$ and $B$ in Eqs. 10.26 and 10.27 are as yet undetermined; to find them I have to specify initial conditions for Eqs. 10.21 and 10.22, an issue I have not even mentioned yet. At the same time, you are perfectly aware that any problem involving time evolution is not complete without initial conditions, which in quantum mechanics mean the state of the system at some instant of time defined as $t = 0$.

It is usually assumed in this type of problem that one can “turn on” the time-dependent interaction at some instant determined by the will of the experimentalist, and in many cases this does make sense. For instance, the time-dependent term in Hamiltonian 10.12 can represent a laser beam, which you can, indeed, turn on and off at will. In this case one can prepare the system in a specific state before the laser is turned on and study how this state evolves due to the interaction with the laser radiation. It is simplest to prepare the system in the lowest energy stationary state, and so this is what I will choose as the initial condition:

$$|\psi(0)\rangle = |2\rangle.$$

Taking into account Eq. 10.13, I can translate it into the following initial conditions for the dynamic variables a1 and a2:

$$a_1(0) = 0, \tag{10.28}$$

$$a_2(0) = 1. \tag{10.29}$$

Substituting $t = 0$ into Eqs. 10.26 and 10.27 and using Eqs. 10.28 and 10.29, I derive the following equations for amplitudes $A$ and $B$:

$$A + B = 0,$$

$$-\frac{2}{\Omega_R}\left[A\omega_+ + B\omega_-\right] = 1,$$

which are easily solved to yield

$$A = -B = -\frac{\Omega_R}{2(\omega_+ - \omega_-)}.$$

It is easy to see using Eq. 10.25 that

$$\omega_+ - \omega_- = \sqrt{\Delta^2+\Omega_R^2},$$

so that the amplitudes take on the value

$$A = -B = -\frac{\Omega_R}{2\sqrt{\Delta^2+\Omega_R^2}}.$$

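Having found $A$ and $B$, I can write down the final solutions for the time-dependent coefficients $a_{1,2}(t)$:

$$a_1 = \frac{\Omega_R}{2\sqrt{\Delta^2+\Omega_R^2}}\left[\exp(i\omega_- t) - \exp(i\omega_+ t)\right], \tag{10.30}$$

$$a_2 = \frac{1}{\sqrt{\Delta^2+\Omega_R^2}}\left[\omega_+\exp(-i\omega_- t) - \omega_-\exp(-i\omega_+ t)\right]. \tag{10.31}$$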
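A quick numerical check of this solution can be reassuring. The sketch below is only an illustration, not part of the derivation: it builds $a_1$ and $a_2$ from Eqs. 10.26 and 10.27 with $A = -B$ fixed by the initial conditions, and lets you confirm that $a_1(0) = 0$, $a_2(0) = 1$, and that $|a_1|^2 + |a_2|^2$ stays equal to 1. The function name `amplitudes` and the sample values of the detuning and $\Omega_R$ are assumptions made for the example.

```python
import cmath
import math

def amplitudes(t, delta, omega_r):
    """Return (a1, a2) at time t for the initial state a1(0) = 0, a2(0) = 1.

    delta is the detuning and omega_r the Rabi frequency (omega_r must be nonzero).
    """
    root = math.sqrt(delta**2 + omega_r**2)
    w_plus = 0.5 * delta + 0.5 * root     # Eq. 10.25, upper sign
    w_minus = 0.5 * delta - 0.5 * root    # Eq. 10.25, lower sign
    A = -omega_r / (2.0 * root)           # fixed by the initial conditions
    B = -A
    # Eq. 10.26
    a1 = A * cmath.exp(1j * w_plus * t) + B * cmath.exp(1j * w_minus * t)
    # Eq. 10.27
    a2 = -(2.0 / omega_r) * (A * w_plus * cmath.exp(-1j * w_minus * t)
                             + B * w_minus * cmath.exp(-1j * w_plus * t))
    return a1, a2

# Illustrative values only: detuning 0.5 and Rabi frequency 1.0 in arbitrary units.
for t in (0.0, 1.0, 2.5):
    a1, a2 = amplitudes(t, 0.5, 1.0)
    print(t, abs(a1) ** 2 + abs(a2) ** 2)
```

The printed sums should all be equal to 1 up to rounding, which is just the statement that the state remains normalized.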

These equations formally solve the problem I set out for you to solve: you now know the time-dependent state of the two-level system described by Hamiltonian 10.12 at any instant of time. But I wouldn’t blame you if you still have this annoying gnawing feeling of not being quite satisfied, probably because you are not quite sure what to do with this solution and what kind of useful physical information you can dig out of it. Indeed, the standard interpretation of the coefficients in expressions similar to Eq. 10.13 as probability amplitudes, whose squared absolute values yield the probability of obtaining the corresponding value of an observable whose eigenvectors are used as a basis, wouldn’t work here. The problem is that we are using the basis provided by eigenvectors of the Hamiltonian of a system that does not exist anymore, so this traditional interpretation does not make much sense.

One way to make sense of Eqs. 10.26 and 10.27 is to recognize that in a typical experiment the time-dependent interaction does not last forever: it starts at some instant, which you can designate as $t = 0$, and it usually ends at some time $t = t_f$ (for instance, when a graduate student running the experiment gets tired, turns the laser off, and goes on a date). So, after the time-dependent part of the Hamiltonian vanishes, you are back to the standard situation, but the system is now in a superposition state defined by the values of the coefficients $a_{1,2}$ at the time when the laser was switched off. Now you can quickly measure the energy and interpret the results in terms of the probabilities of getting one of two values, $E^{(0)}_1$ or $E^{(0)}_2$. The probability $p(E^{(0)}_1)$ that the measurement yields $E^{(0)}_1$ is given, as usual, by $|a_1|^2$, which according to Eq. 10.30 is

$$\begin{aligned}
p\left(E^{(0)}_1\right) &= \frac{\Omega_R^2}{4\left(\Delta^2+\Omega_R^2\right)}\left(2 - \exp\left[i(\omega_+-\omega_-)t_f\right] - \exp\left[-i(\omega_+-\omega_-)t_f\right]\right) \\
&= \frac{\Omega_R^2}{2\left(\Delta^2+\Omega_R^2\right)}\left[1 - \cos(\omega_+-\omega_-)t_f\right] = \frac{\Omega_R^2}{\Delta^2+\Omega_R^2}\sin^2\frac{(\omega_+-\omega_-)\,t_f}{2} \\
&= \frac{\Omega_R^2}{\Delta^2+\Omega_R^2}\sin^2\frac{\sqrt{\Delta^2+\Omega_R^2}\;t_f}{2}.
\end{aligned} \tag{10.32}$$

The probability that this measurement yields the value $E^{(0)}_2$ could have been computed in exactly the same manner, and I will give you a chance to do it as an exercise, but here I will be smart and take advantage of the fact that $p(E^{(0)}_1) + p(E^{(0)}_2) = 1$, so that without much ado, I can present you with

$$p\left(E^{(0)}_2\right) = 1 - \frac{\Omega_R^2}{\Delta^2+\Omega_R^2}\sin^2\frac{\sqrt{\Delta^2+\Omega_R^2}\;t_f}{2} = \frac{\Delta^2}{\Delta^2+\Omega_R^2}\sin^2\frac{\sqrt{\Delta^2+\Omega_R^2}\;t_f}{2} + \cos^2\frac{\sqrt{\Delta^2+\Omega_R^2}\;t_f}{2}. \tag{10.33}$$

Equations 10.32 and 10.33 create a clear physical picture of what is happening with our system. The first thing to note is the periodic oscillation of the probabilities with time at the frequency $\Omega_{GR} = \sqrt{\Delta^2+\Omega_R^2}$, called the generalized Rabi frequency (note that the factor $1/2$ in the arguments of the cos and sin functions in these equations results from the transition from $\cos x$ to functions of $x/2$ and is, therefore, not included in the definition of the frequency). There exist special times $t_{fn} = 2\pi n/\Omega_{GR}$, where $n$ is an integer, when the probability that the system will be found in the higher energy state is zero, and there are times $t_{fn} = (2n+1)\pi/\Omega_{GR}$ when this probability acquires its maximum value $\Omega_R^2/\left(\Delta^2+\Omega_R^2\right)$. For the probability $p(E^{(0)}_2)$ the situation is reversed: it reaches unity at the times $t_{fn} = 2\pi n/\Omega_{GR}$, but its minimum value, occurring at $t_{fn} = (2n+1)\pi/\Omega_{GR}$, is not zero but equal to $\Delta^2/\left(\Delta^2+\Omega_R^2\right)$. Figure 10.2 depicts these oscillations of probability, known as Rabi oscillations. The period of these oscillations, as well as the maximum and minimum values of the corresponding probabilities, depends on the detuning parameter $\Delta$ controlled by the experimentalist. For large detuning, $\Delta \gg \Omega_R$, the frequency of oscillations is determined mostly by $\Delta$, but their swing (the difference between the largest and smallest values) diminishes. For instance, the largest value of $p(E^{(0)}_1)$ becomes of the order of $\Omega_R^2/\Delta^2 \ll 1$, while the smallest value of $p(E^{(0)}_2)$ is in this case very close to unity: $1 - \Omega_R^2/\Delta^2$. For both probabilities there is not much oscillation to speak of. A more interesting situation arises in the case of small detuning, with the special case of zero detuning being of most interest. The frequency of Rabi oscillations in this case becomes smallest and is equal to $\Omega_R$, which is called the Rabi frequency, and the swing of the probabilities, between exactly zero and exactly unity, becomes most pronounced.

Fig. 10.2 Oscillations of $p(E^{(0)}_1)$ for three values of the detuning: $\Delta = 0$, $\Delta/\Omega_R = 0.5$, and $\Delta/\Omega_R = 1.5$
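To get a feel for the numbers behind these oscillations, here is a minimal sketch that evaluates Eqs. 10.32 and 10.33 for the three detunings quoted in the caption of Fig. 10.2. The function names `p_upper` and `p_lower` and the choice of units ($\Omega_R = 1$) are assumptions made for illustration; the code simply evaluates the formulas and does not reproduce the figure itself.

```python
import math

def p_upper(t, delta, omega_r):
    """Probability of finding the higher energy at switch-off time t (Eq. 10.32)."""
    omega_gr = math.sqrt(delta**2 + omega_r**2)   # generalized Rabi frequency
    return omega_r**2 / omega_gr**2 * math.sin(0.5 * omega_gr * t) ** 2

def p_lower(t, delta, omega_r):
    """Probability of finding the lower energy (Eq. 10.33)."""
    return 1.0 - p_upper(t, delta, omega_r)

omega_r = 1.0   # assumed unit of frequency
for ratio in (0.0, 0.5, 1.5):
    delta = ratio * omega_r
    omega_gr = math.sqrt(delta**2 + omega_r**2)
    t_peak = math.pi / omega_gr          # here the argument of sin^2 equals pi/2
    period = 2.0 * math.pi / omega_gr    # period of the probability oscillations
    print(ratio, round(p_upper(t_peak, delta, omega_r), 4), round(period, 4))
```

The second printed column is the largest value reached by the upper-level probability, and the third is the oscillation period; both shrink as the detuning grows, in line with the discussion above.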

If you are interested in how one can observe Rabi oscillations, here is an example of how it can be done. Imagine that you subject an ensemble of two-level systems to a strong time-periodic electric field with small detuning and turn it off at different times. The timing of the switch-off determines the probability for a two-level system to be in the higher energy state. The fraction of the systems in the ensemble in this state is proportional to the corresponding probability. The systems will eventually undergo a transition to the lower energy level and emit light. The intensity of the emitted light will be proportional to the number of systems in the upper state and will change periodically with the switch-off time. In real experiments there is no actual need to turn the electric field on and off all the time, because spontaneous transitions of the system from the upper to the lower energy state happen even with the electric field on, and when such a transition happens, the system is kicked out of its normal dynamics so hard that it forgets everything that was happening to it before, so that the whole process starts anew. These kicks effectively serve as switches for the electric field. Oscillations in this case can be observed as functions of the Rabi frequency, which is controlled by the strength of the applied electric field. It is important that Rabi oscillations can be observed only if their period is shorter than the time interval between the “kicks.” To fulfill this condition, the applied electric field must be strong enough to yield oscillations with a sufficiently high frequency.

10.3 Problems

Problems for Sect. 10.1

Problem 136 Find the approximate expression for the energy levels of a two-level system with a time-independent perturbation in the limit $|V_{12}| \ll |E_1 - E_2|$ for two cases: $E_1 > E_2$ and $E_1 < E_2$.

Problem 137 Assume that the perturbation part of the Hamiltonian is given by $\hat V = \hat z\,\mathcal{E}$, where $\mathcal{E}$ is the electric field and $\hat z$ is the respective coordinate operator. Assume also that the states included in the Hamiltonian are described (in the coordinate representation) by wave functions defined on the one-dimensional interval $-\infty < z < \infty$:

$$\langle z|E_1\rangle = \frac{1}{\sqrt{a_B}}\exp\left(-|z|/a_B\right),$$

$$\langle z|E_2\rangle = \sqrt{\frac{2}{a_B^3}}\;z\exp\left(-|z|/a_B\right),$$

where $a_B$ is the Bohr radius for an electron in the hydrogen atom, and that the electric field is given in terms of the binding energy of the hydrogen atom $W_b$ and the electron’s charge $e$ as $\mathcal{E} = W_b/(e a_B)$. The unperturbed energy levels are given as

$$E_1 = W_b(1+u), \qquad E_2 = W_b(1-u),$$

where $u$ is a dimensionless parameter that can be varied between the values $-2$ and $2$.

1. Find the perturbation matrix $V_{ij}$.
2. Find the eigenvalues of the full Hamiltonian, and plot them as functions of $u$.
3. Also find the eigenvectors of the full Hamiltonian, and plot the ratio of the relative weights of the initial vectors $|E_{1,2}\rangle$ in both of the found eigenvectors as functions of $u$.
4. Also, consider the phases of the ratio of the coefficients $c_1/c_2$ for both eigenvectors, and plot their dependence on the parameter $u$.
5. In all plots pay special attention to the region around $u = 0$, and describe how the behavior of the eigenvalues of the perturbed Hamiltonian differs from the corresponding behavior of the unperturbed energies.
6. Describe the behavior of the absolute values and the phases of $c_1/c_2$ in the vicinity of $u = 0$.

Problems for Sect. 10.2

Problem 138 Find the probability that the measurement of the energy will yield the value $E_2$ directly from the coefficient $a_2$ in Eq. 10.31, and verify that the expression for this probability given in the text (Eq. 10.33) is correct.

Problem 139 Find the time dependence of the probabilities $p(E_{1,2})$ assuming that at time $t = 0$ the system was in the state $|1\rangle$.

Chapter 11 Non-interacting Many-Particle Systems

11.1 Identical Particles in the Quantum World: Bosons and Fermions

Quantum mechanical properties of a single particle are an important starting point for studying quantum mechanics, but in real experimental and practical situations, you will rarely deal with just a single particle. Most frequently you encounter systems consisting of many (from two to infinity) interacting particles. The main difficulty in dealing with many-particle systems comes from the significantly increased dimensionality of the space in which all possible states of such systems reside. In Sect. 9.4 you saw that the states of a system of two spins belong to a four-dimensional spinor space. It is not too difficult to see that the states of a system consisting of $N$ spins need a $2^N$-dimensional space to fit them all. Indeed, with each new spin-$1/2$ particle, which brings two new spin states, you double the number of basis vectors in the respective tensor product, and even a system of as few as ten particles inhabits a space requiring 1024 basis vectors. More generally, imagine that you have a particle that can be in one of $M$ mutually exclusive states, represented by $M$ mutually orthogonal vectors (I will call them single-particle states), which can be used as a basis in this single-particle $M$-dimensional space. You can generate a tensor product of single-particle spaces by stacking together the $M$ basis vectors from each single-particle space. Naively you might think that the dimension of the resulting space for $N$ particles will be $M^N$, but it is not always so. The reality is more interesting, and to get the dimensionality of the space of many-particle states right, you need to dig deeper into the concept of identity of quantum particles.

In the classical world, we know that all electrons are the same (the same charge and the same mass), but if necessary we can still distinguish between them by saying that this is an electron with such and such initial coordinates and initial velocity, and therefore it follows this particular trajectory. A second electron, which is exactly the same as the first one but starts out with different initial conditions, follows its own trajectory. And even if these two electrons interact and scatter off each other, we can still say which electron is which by following their trajectories (see Fig. 11.1). Thus, we say that classical electrons, even though they are identical, are still distinguishable.

Fig. 11.1 Two distinguishable classical electrons interact with each other and follow their own distinguishable trajectories. We can easily say which electron follows which trajectory

Fig. 11.2 Propagating clouds of probabilities representing the particles. In the interaction region, the clouds overlap, and the individuality of the particles is lost (Warning: it is dangerous to take this cartoon too seriously!)

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_11

The situation changes when you are talking about quantum particles. In essentially the same setup (two particles approach each other from opposite directions, interact, and move off, each in its new direction), the situation becomes completely different. Now, instead of two well-localized particles with perfectly defined trajectories, you are dealing with moving amorphous clouds of probabilities, and when they approach each other and overlap, all you can measure is the probability of finding one or two particles within a certain region of space, but you have no means of telling which of the observed particles is which (Fig. 11.2). In quantum mechanics the individuality of particles is completely lost: they are not just identical, they are indistinguishable.

Now the questions arise: how does one formally describe this indistinguishability, and what are the observable consequences of this property? To begin, let me formally assign numbers 1 and 2 to the two particles and assume that particle 1 is in the state described by vector $|\alpha^{(1)}\rangle$, where $\alpha$ indicates a particular quantum state and 1 assigns this state to the first particle, and the second particle is in the state $|\beta^{(2)}\rangle$. The space of the two-particle states can be generated by the tensor product of the single-particle states with a two-vector basis:

$$\left|\psi^{(tp)}_1\right\rangle = |\alpha^{(1)}\rangle|\beta^{(2)}\rangle, \tag{11.1}$$

$$\left|\psi^{(tp)}_2\right\rangle = |\alpha^{(2)}\rangle|\beta^{(1)}\rangle, \tag{11.2}$$

where the second vector is obtained from the first by replacing particle 1 with particle 2. This operation can be formally described by a special “exchange” operator $\hat P(1,2)$ whose job is to interchange the indexes of the particles assigned to each state:

$$|\alpha^{(2)}\rangle|\beta^{(1)}\rangle = \hat P(1,2)\,|\alpha^{(1)}\rangle|\beta^{(2)}\rangle.$$

When applied twice, this operator obviously leaves the initial vector intact (two exchanges, $1 \to 2$ and $2 \to 1$, are equivalent to no exchange at all), meaning that $\hat P^2(1,2) = \hat I$. An immediate consequence of this identity is that the eigenvalues of this operator are equal to either $1$ or $-1$.

Using the exchange operator, the concept of indistinguishability can be formulated in a precise and formal way. Consider an arbitrary state of two particles represented by vector $|\psi(1,2)\rangle$. If particles 1 and 2 are truly indistinguishable, then vector $|\psi(2,1)\rangle = \hat P(1,2)|\psi(1,2)\rangle$ and the initial vector $|\psi(1,2)\rangle$ must represent the same state, which means that they can differ from each other only by a constant factor. The formal representation of the last statement looks like this:

$$\hat P(1,2)\,|\psi(1,2)\rangle = |\psi(2,1)\rangle = \lambda\,|\psi(1,2)\rangle, \tag{11.3}$$

which makes it clear that if $|\psi(1,2)\rangle$ represents a state of indistinguishable particles, it must be an eigenvector of $\hat P(1,2)$. The remarkable thing about this conclusion is that there are only two types of such eigenvectors: those corresponding to eigenvalue $1$ and those that belong to eigenvalue $-1$. In other words, any vector describing a state of indistinguishable particles must belong to one of two classes: symmetric (even) with respect to the exchange of the particles, when

$$\hat P(1,2)\,|\psi(1,2)\rangle = |\psi(1,2)\rangle, \tag{11.4}$$

or antisymmetric (odd), if

$$\hat P(1,2)\,|\psi(1,2)\rangle = -|\psi(1,2)\rangle. \tag{11.5}$$
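The action of the exchange operator is easy to mimic on a computer. In the sketch below, which is only an illustration under my own conventions, a two-particle state is stored as a dictionary mapping the pair (orbital of particle 1, orbital of particle 2) to an amplitude; the names `exchange` and `scale` and the orbital labels are hypothetical. The checks confirm that two exchanges restore the state and that the symmetric and antisymmetric combinations are eigenvectors with eigenvalues $1$ and $-1$.

```python
def exchange(state):
    """Apply the exchange operator: swap the orbitals carried by particles 1 and 2.

    A state is a dict {(orbital of particle 1, orbital of particle 2): amplitude}.
    """
    return {(q, p): amp for (p, q), amp in state.items()}

def scale(state, factor):
    """Multiply every amplitude of the state by a constant factor."""
    return {key: factor * amp for key, amp in state.items()}

product_state = {("alpha", "beta"): 1.0}                               # particle 1 in alpha, 2 in beta
symmetric = {("alpha", "beta"): 1.0, ("beta", "alpha"): 1.0}           # not normalized
antisymmetric = {("alpha", "beta"): 1.0, ("beta", "alpha"): -1.0}      # not normalized

print(exchange(exchange(product_state)) == product_state)              # two exchanges = identity
print(exchange(symmetric) == symmetric)                                # eigenvalue +1
print(exchange(antisymmetric) == scale(antisymmetric, -1.0))           # eigenvalue -1
print(exchange(product_state) == product_state)                        # a bare product is not an eigenvector
```

The first three lines print `True`, while the last prints `False`: a simple product of two different orbitals is not by itself a legitimate state of indistinguishable particles.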

Moreover, Hamiltonians of indistinguishable particles obviously do not change when the particles are exchanged (otherwise they wouldn’t be indistinguishable), which means that the exchange operator and the Hamiltonian commute:

$$\left[\hat H, \hat P(1,2)\right] = 0 \tag{11.6}$$

(if it is not clear where this comes from, check the discussion of the parity operator in Sect. 5.1, where similar issues were raised). In the context of the exchange operator, Eq. 11.6 signifies two things. First, the Hamiltonian and the exchange operator are compatible and share a common system of eigenvectors. In other words, eigenvectors of a Hamiltonian of two indistinguishable particles must be either symmetric or antisymmetric. Second, if you treat the exchange operator as a representative of a specific observable that takes only two values depending on the symmetry of the state, Eq. 11.6 indicates that the expectation value of this observable does not change with time (see Sect. 4.1.3 and the discussion around Eq. 4.17 there). Accordingly, Eq. 11.6 ensures that if a two-particle system starts out in a symmetric or antisymmetric state, it will remain in this state forever.
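The conservation statement can be spelled out in one line. Assuming that Eq. 4.17 has the standard form for the rate of change of the expectation value of an operator with no explicit time dependence (this is my assumption about that equation, not a quotation of it), Eq. 11.6 gives

```latex
\frac{d}{dt}\left\langle \hat P(1,2) \right\rangle
  = \frac{i}{\hbar}\left\langle \left[\hat H, \hat P(1,2)\right] \right\rangle
  = 0 .
```

So a state that starts with $\langle \hat P(1,2)\rangle = 1$ (symmetric) or $\langle \hat P(1,2)\rangle = -1$ (antisymmetric) keeps that value at all later times.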

While it is useful to know that all states of indistinguishable particles must belong to one of the two symmetry classes and that a system put in a symmetry class at some instant of time will stay in this class forever, we still do not know how to relate the symmetry of the states of a particular system of particles to their other properties: does it depend on the particles’ charges, masses, and the potential they are moving in, or can it somehow be created deliberately through a clever measurement process? I personally find the answer to all these questions, which I am about to reveal to you, quite amazing: the symmetry of any state of indistinguishable particles cannot be “chosen” or changed; the particles are born with a predestined fate to be only in states of one or the other symmetry, predetermined by their spin. It turns out that particles with half-integer spin can exist only in antisymmetric states, while particles with integer spin can be only in symmetric states. This statement is called the spin-statistics theorem and is, in my view, one of the most amazing fundamental results, which follows purely mathematically from the requirement that quantum mechanics agree with relativity theory. Just stop to think about it: quantum mechanics deals with phenomena occurring at very small spatial, temporal, mass, and energy scales, while relativity theory explains the behavior of nature at very large velocities. Apparently, quantum mechanics and relativity overlap when the interaction between light and matter is involved and, at high energies, when particle-antiparticle phenomena become important. However, the theoretical requirements of the self-consistency of the theory, one of which is the spin-statistics theorem, are felt well outside of these overlap areas and penetrate all of quantum mechanics, from atomic energy structure to the electric, magnetic, and optical properties of solids and fluids. Two of the better known phenomena made possible by the spin-statistics connection are superfluidity and superconductivity. The proof of this theorem relies heavily on quantum field theory, and you will sleep better at night by just accepting it as one of the axioms of quantum mechanics.

The spin-statistics theorem was proved by Wolfgang Pauli in 1939 but published only in 1940. The footnote to Pauli’s paper in Physical Review states that the paper is a part of the report prepared for the Solvay Congress of 1939, which did not take place because of the war in Europe. By the time of that publication, Pauli had moved from Zurich to Princeton, because Switzerland rejected his request for Swiss citizenship on the grounds that he had become a German citizen after Hitler annexed his native Austria. Anyway, to finish with this theorem, I only need to mention that particles with half-integer spin are called fermions (in honor of Enrico Fermi, an Italian physicist, who had to leave Italy after Mussolini came to power; he moved to the USA, where he created the world’s first nuclear reactor and played a crucial role in the Manhattan Project), while particles with integer spin are called bosons after the Indian physicist Satyendra Nath Bose, who worked on systems of indistinguishable photons as early as 1924. It is interesting that after an initial attempt to publish his paper on this topic failed, Bose sent it to Einstein, asking for Einstein’s opinion and assistance with publication. Einstein translated the paper into German and published it in the leading German physics journal of the time, Zeitschrift für Physik (under Bose’s name, of course).

Before getting back to the business of doing practical quantum mechanics with many-particle systems, a few additional words about fermions and bosons might be useful. Among the elementary particles constituting regular matter, fermions are the most abundant: electrons, protons, and neutrons, the main building blocks of atoms, molecules, and the rest of the material world, are all spin-$1/2$ fermions. The only elementary boson you would encounter in a regular setting is the photon, a quantum of the electromagnetic field and not a regular material particle. This can be taken as a general rule: as long as we are talking about elementary particles, matter is represented by fermions, while the interaction fields and other objects that would classically be described as waves become bosons in quantum mechanics. Other examples of bosons are quantized elastic waves (phonons) and quantized magnetic waves (magnons).

However, the concept of fermions and bosons can be extended to composite particles as long as the processes they take part in do not change their internal structure. The most famous examples of such composite particles are electron Cooper pairs (Cooperons, named after the American physicist Leon Cooper, who discovered them in 1956), responsible for the phenomenon of superconductivity, and $^4$He nuclei, which in addition to the two mandatory protons contain two neutrons, making the total number of particles in the nucleus equal to 4. In both of these examples, we are dealing with composite bosons. Indeed, a pair of electrons, as you already know, can be in a state with total spin either 0 or 1, i.e., the spin of the pair is an integer in either case. In the case of the $^4$He nucleus, there are four spin-$1/2$ particles, and by dividing them into two pairs, you can also see that the total spin of this system can again only be an integer. Another interesting example of a composite boson is the exciton in semiconductors, which consists of an electron and a hole,¹ both with spin $1/2$. The extent to which the inner structure in all these examples can be neglected and the particles can be treated as bosons depends on the amount of energy required to break them into their constituent parts. For Cooperons this energy is quite small, of the order of $10^{-3}$ eV, which explains why they can only survive at very low (below 10 K) temperatures; exciton binding energies vary over a rather large range, between several millielectronvolts and several hundred millielectronvolts depending upon the material, and excitons therefore survive at temperatures between 10 K and room temperature (300 K). The $^4$He nucleus, of course, is the most stable of all the composite particles discussed: it takes a whole 28.3 megaelectronvolts to take it apart.

¹ Energy levels in a semiconductor are organized in bands separated by large gaps. The band in which all energy levels are filled with electrons is called the valence band, and the closest empty band is the conduction band. When an electron is excited from the valence band, the conduction band acquires an electron, and the valence band loses an electron, which leaves in its stead a positively charged hole. Here you have an electron-hole pair behaving as real negatively and positively charged spin-$1/2$ particles.
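The link between the binding energies quoted above and the temperatures at which the composite particles survive can be made semi-quantitative by comparing the binding energy with the thermal energy $k_B T$. The sketch below is a rough order-of-magnitude estimate, not a calculation from the text: treating $k_B T \approx$ binding energy as the survival criterion is my simplifying assumption, and the function name `equivalent_temperature` is hypothetical.

```python
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K

def equivalent_temperature(energy_ev):
    """Temperature (in K) at which the thermal energy k_B * T equals the given energy."""
    return energy_ev / K_B_EV_PER_K

print(equivalent_temperature(1e-3))      # Cooper pair scale, ~1e-3 eV
print(K_B_EV_PER_K * 300.0)              # thermal energy at room temperature, in eV
print(equivalent_temperature(28.3e6))    # 28.3 MeV needed to take the He-4 nucleus apart
```

A binding energy of $10^{-3}$ eV corresponds to roughly 12 K, consistent with Cooper pairs surviving only below about 10 K, while the thermal energy at 300 K is about 26 meV, which falls inside the quoted range of exciton binding energies.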

After this short and hopefully entertaining detour, it is time to get back to the business of figuring out how to implement the requirements of the spin-statistics theorem in practical calculations. A generic vector representing a two-particle state and expressed as a linear combination of basis vectors 11.1 and 11.2,

$$|\psi(1,2)\rangle = a_1\,|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + a_2\,|\alpha^{(2)}\rangle|\beta^{(1)}\rangle,$$

with arbitrary coefficients $a_{1,2}$ does not obey the required symmetry condition. However, after a few minutes of contemplation and silent staring at this expression, you will probably see that you can satisfy the symmetry requirements of Eq. 11.4 by choosing $a_1 = a_2$, while Eq. 11.5 can be made happy with the choice $a_1 = -a_2$. (If you are not that big on contemplation, just switch the particles in the expression for $|\psi(1,2)\rangle$, and write down the conditions of Eq. 11.4 or 11.5 explicitly.) If in addition to symmetry you also want your two-particle states to be normalized, you can choose for fermions

$$|\psi_f(1,2)\rangle = \frac{1}{\sqrt{2}}\left(|\alpha^{(1)}\rangle|\beta^{(2)}\rangle - |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right) \tag{11.7}$$

and for bosons

$$\left|\psi^{(1)}_b(1,2)\right\rangle = \frac{1}{\sqrt{2}}\left(|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right). \tag{11.8}$$

While Eq. 11.7 exhausts all possible two-particle states for fermions, in the case of bosons, two more states, in which different particles occupy the same single-particle state, can be constructed:

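$$\left|\psi^{(2)}_b(1,2)\right\rangle = |\alpha^{(1)}\rangle|\alpha^{(2)}\rangle, \tag{11.9}$$

$$\left|\psi^{(3)}_b(1,2)\right\rangle = |\beta^{(1)}\rangle|\beta^{(2)}\rangle. \tag{11.10}$$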
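The four states of Eqs. 11.7 through 11.10 can be assembled by one and the same recipe: take a product of two orbitals, add the exchanged product with a plus or minus sign, and normalize. The sketch below illustrates this recipe; the dictionary representation and the helper names `pair`, `normalized`, and `exchange` are assumptions introduced for the example. As a side check, it also shows what happens when the minus sign is used with two copies of the same orbital.

```python
import math

def pair(orb_a, orb_b, sign):
    """Build |a(1)>|b(2)> + sign * |a(2)>|b(1)> as {(orbital of 1, orbital of 2): amplitude}."""
    state = {}
    for key, amp in (((orb_a, orb_b), 1.0), ((orb_b, orb_a), sign)):
        state[key] = state.get(key, 0.0) + amp
    # Drop terms whose amplitudes have cancelled exactly.
    return {key: amp for key, amp in state.items() if amp != 0.0}

def norm_squared(state):
    return sum(abs(amp) ** 2 for amp in state.values())

def normalized(state):
    """Rescale a nonzero state to unit norm."""
    norm = math.sqrt(norm_squared(state))
    return {key: amp / norm for key, amp in state.items()}

def exchange(state):
    """Swap the particle labels, i.e., apply the exchange operator."""
    return {(q, p): amp for (p, q), amp in state.items()}

fermion = normalized(pair("alpha", "beta", -1.0))    # Eq. 11.7
boson_1 = normalized(pair("alpha", "beta", +1.0))    # Eq. 11.8
boson_2 = normalized(pair("alpha", "alpha", +1.0))   # Eq. 11.9
boson_3 = normalized(pair("beta", "beta", +1.0))     # Eq. 11.10

print(fermion)
print(exchange(boson_1) == boson_1)     # symmetric under exchange
print(pair("alpha", "alpha", -1.0))     # {} : the antisymmetric combination vanishes
```

The empty dictionary in the last line is the zero vector: with the minus sign, two particles in the same orbital leave nothing to normalize.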

An attempt to arrange a similar state for fermions fails because you cannot have an antisymmetric expression with two identical states: the two terms simply cancel each other, giving you zero. In other words, it is impossible to have a two-particle state of fermions in which both fermions are in the same single-particle state. This is essentially an expression of the famous Pauli exclusion principle, which Pauli formulated in 1925 while trying to explain why atoms with an even number of electrons are more chemically stable than atoms with an odd number of electrons. He realized that this can be explained by requiring that there be only one electron per single-electron state. If one takes into account only orbital quantum numbers, such as the principal number $n$, the orbital number $l < n$, and the magnetic number $|m| \le l$ (see Chap. 8), the total number of available states is equal to $n^2$, which does not have to be even. So, Pauli postulated the existence of yet another quantum quantity, which can take only two different values, making the total number of quantum numbers characterizing a state of an electron in an atom equal to 4 and the total number of available states $2n^2$. The initial formulation of this principle was concerned only with electrons and was stated approximately like this: no two electrons in a many-electron atom can have the same values of all four quantum numbers. Despite the success of this principle in explaining the periodic table, Pauli remained unsatisfied for two principal reasons: (a) he had no idea which physical quantity the fourth quantum number represents, and (b) he was not able to derive his principle from more fundamental postulates of quantum mechanics. The first of his concerns was resolved with the emergence of the idea of spin (see Sect. 9.1), but it took him 14 long years to finally prove the spin-statistics theorem, of which his exclusion principle is a simple corollary.
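The counting behind the numbers $n^2$ and $2n^2$ is easy to verify by brute force. The short sketch below lists all pairs $(l, m)$ allowed by $l < n$ and $|m| \le l$ and then doubles the count to account for the extra two-valued quantum number; the function name `orbital_states` is a hypothetical helper introduced for this illustration.

```python
def orbital_states(n):
    """List the (l, m) pairs allowed for principal number n: l < n and |m| <= l."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in (1, 2, 3, 4):
    count = len(orbital_states(n))
    # count equals n**2; doubling it gives the 2 * n**2 states with the fourth quantum number
    print(n, count, 2 * count)
```

For $n = 1$ and $n = 3$ the orbital count alone (1 and 9) is odd, which is exactly why a fourth, two-valued quantum number was needed to make the number of available states even.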

Before continuing I would like to clear up one terminological problem. When dealing with many-particle systems, the word “state” might have different meanings in different contexts. On the one hand, I will talk about states characterizing the actual many-particle system; Eqs. 11.7 through 11.10 give examples of such states for the two-particle system. On the other hand, I use single-particle states, such as $|\alpha^{(1)}\rangle$ or $|\beta^{(2)}\rangle$, to construct the many-particle states $|\psi_f(1,2)\rangle$ or $|\psi^{(i)}_b(1,2)\rangle$. So, in order to avoid misunderstandings and misconceptions, let’s agree that the term “state” from now on will always refer to an actual state of a many-particle system, while single-particle states from this point forward will be called single-particle orbitals. Understood literally, orbitals are usually used to describe single-electron states of atomic electrons, but I will take the liberty of extending this term to any single-electron state. Having gotten this out of the way, I now want to direct your attention to the following fact. In the system of two fermions with only two available orbitals, we ended up with just a single two-particle state. At the same time, in the case of the same number of bosons and the same number of orbitals, there are three linearly independent orthogonal two-particle states, and if we were to forget about symmetry requirements (as we would if dealing with distinguishable particles), we would have ended up with a four-dimensional space of two-particle states, just as in the two-spin problem from Sect. 9.4. You can see now that the dimensionality of the space containing many-particle states depends strongly on the symmetry requirements, and the naive prediction that this dimension is $M^N$ turns out to be correct only for distinguishable particles.
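The three counts mentioned here (one state for fermions, three for bosons, four for distinguishable particles) can be reproduced by direct enumeration. The sketch below is an illustration resting on a stated assumption: each fermion state corresponds to a choice of distinct orbitals, each boson state to a choice of orbitals with repetition allowed, and each state of distinguishable particles to an ordered assignment. The function name `count_states` is hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

def count_states(n_particles, n_orbitals):
    """Count basis states for fermions, bosons, and distinguishable particles."""
    orbitals = range(n_orbitals)
    # Fermions: distinct orbitals, order irrelevant (one antisymmetric state per choice).
    fermions = sum(1 for _ in combinations(orbitals, n_particles))
    # Bosons: orbitals may repeat, order irrelevant (one symmetric state per choice).
    bosons = sum(1 for _ in combinations_with_replacement(orbitals, n_particles))
    # Distinguishable particles: every ordered assignment is a separate state.
    distinguishable = sum(1 for _ in product(orbitals, repeat=n_particles))
    return fermions, bosons, distinguishable

print(count_states(2, 2))   # (1, 3, 4)
print(count_states(2, 3))   # two particles, three orbitals
```

The last entry of each tuple is the naive $M^N$, while the first two show how much smaller the spaces of properly symmetrized states are.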


11.2 Constructing a Basis in a Many-Fermion Space

While identical bosons are responsible for some fascinating phenomena such as superfluidity and superconductivity, systems of many fermions are much more ubiquitous in the practical applications of quantum theory, and, therefore, I will mostly focus on them from now on. As always, the first thing to understand is the structure of the space in which the vectors representing the states of interest live. This includes finding its dimension and constructing a basis. The problem of finding the dimension of a many-particle space is an exercise in combinatorics, the science of counting the number of different combinations of various objects. In the case of fermions, the problem is formulated quite simply: given $N$ objects (particles) and $M$ boxes (orbitals), you need to compute in how many different ways you can fill the boxes, assuming that each box can hold only one particle and that the order in which the particles are distributed among the boxes is not important. Once you find one distribution of the particles among the boxes, it becomes a seed for one many-particle state. The state itself is found by permuting the particles among the boxes, adding a negative sign for each pair exchange, and summing up the results.

To understand the situation better, begin with the simplest case: $M = N$. When the number of particles is equal to the number of orbitals, you do not have much of a choice: you just have to put one particle in each box and then do the permutations, and you end up with a single antisymmetric state. As an example, consider three particles that can be in one of three available orbitals $|\alpha^{(s)}_i\rangle$, where the lower index enumerates the orbitals and the upper index refers to the particles. Assume that you put the first particle in the first box, the second particle in the second, and the third one in the third, generating the following combination of the orbitals: $|\alpha^{(1)}_1\rangle|\alpha^{(2)}_2\rangle|\alpha^{(3)}_3\rangle$. Now, let me switch particles 1 and 2, generating the combination $-|\alpha^{(2)}_1\rangle|\alpha^{(1)}_2\rangle|\alpha^{(3)}_3\rangle$. If I switch the particles again, say, particles 1 and 3, I will get the new combination $|\alpha^{(2)}_1\rangle|\alpha^{(3)}_2\rangle|\alpha^{(1)}_3\rangle$. Note that the negative sign has disappeared because each new exchange brings about a change of sign. Making all 6 ($3!$) permutations, you will end up with a single three-particle state:

$$\begin{aligned}
&|\alpha^{(1)}_1\rangle|\alpha^{(2)}_2\rangle|\alpha^{(3)}_3\rangle - |\alpha^{(2)}_1\rangle|\alpha^{(1)}_2\rangle|\alpha^{(3)}_3\rangle + |\alpha^{(3)}_1\rangle|\alpha^{(1)}_2\rangle|\alpha^{(2)}_3\rangle \\
&\quad - |\alpha^{(3)}_1\rangle|\alpha^{(2)}_2\rangle|\alpha^{(1)}_3\rangle + |\alpha^{(2)}_1\rangle|\alpha^{(3)}_2\rangle|\alpha^{(1)}_3\rangle - |\alpha^{(1)}_1\rangle|\alpha^{(3)}_2\rangle|\alpha^{(2)}_3\rangle.
\end{aligned} \tag{11.11}$$

In agreement with the permutation rules described above, all terms in Eq. 11.11 with negative signs in front of them can be obtained from the first term by exchanging just one pair of particles, while the terms with the positive sign are obtained by two successive pair exchanges. It makes sense, of course, because, as I said before, an exchange of any two fermions is accompanied by a change of sign, so that two successive exchanges change the sign twice, $+ \to - \to +$, which is, of course, the same argument as I made when deriving this expression. Finally, if you are wondering how to choose the first, seeding, term, the answer is simple: it does not matter, and you can start with any of them. The only difference you might notice is that all negative terms could become positive and vice versa, which amounts to an overall negative sign in front of the whole expression, and this makes no physical difference whatsoever.

Now, since every selection of $N$ boxes (as many boxes as there are particles) yields just a single many-particle state, the total number of states is simply equal to the number of ways you can select $N$ boxes out of $M$. This is a classical combinatorial problem with a well-known solution given by the number of combinations of $N$ objects chosen out of $M$. Thus, the number of distinct linearly independent and orthogonal $N$-fermion states based on $M$ available single-fermion orbitals (the dimensionality $D(N,M)$ of the corresponding space) is

$$
D(N,M) = \binom{M}{N} = \frac{M!}{N!\,(M-N)!} .
\tag{11.12}
$$

You can verify this general result with a few simple examples. Let's say that now you want to build a space of three-fermion states using five available single-fermion orbitals. According to Eq. 11.12 this space possesses $5!/(3!\,2!) = 10$ basis vectors. Using the same notation $|\alpha_i^{(s)}\rangle$ as before, but now allowing index $i$ to run from 1 to 5, you can generate the following ten seed vectors, in which each particle is assigned to a different orbital:

$$
\begin{aligned}
&|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle, \quad |\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_4^{(3)}\rangle, \quad |\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_5^{(3)}\rangle, \\
&|\alpha_1^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_4^{(3)}\rangle, \quad |\alpha_1^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_5^{(3)}\rangle, \quad |\alpha_1^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle, \\
&|\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_4^{(3)}\rangle, \quad |\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_5^{(3)}\rangle, \quad |\alpha_2^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle, \\
&|\alpha_3^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle .
\end{aligned}
$$

Each of these seeds yields a single antisymmetric state in exactly the same way as in the previous example.
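If you would rather let a computer do the counting, the seeds can be generated mechanically. The following is a minimal sketch, assuming that orbitals are labeled by the integers 1 to $M$; the helper name `fermion_seeds` is hypothetical.

```python
from itertools import combinations
from math import comb

def fermion_seeds(n_particles, n_orbitals):
    """Every way to pick n_particles distinct orbitals out of n_orbitals.

    Each tuple lists the orbital indices of one seed; particle k is
    assigned to the k-th entry of the tuple.
    """
    return list(combinations(range(1, n_orbitals + 1), n_particles))

seeds = fermion_seeds(3, 5)
for seed in seeds:
    print(seed)          # (1, 2, 3), (1, 2, 4), ..., (3, 4, 5)

# The number of seeds agrees with Eq. 11.12.
print(len(seeds), comb(5, 3))   # 10 10
```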

A bit of gazing at Eq. 11.11 might reveal to you an ultimate truth about the structure of this expression: it is a sum of products of nine elements $|\alpha_i^{(j)}\rangle$, where each index takes three different values, grouped in threes with alternating positive and negative signs. Some digging in your associative memory will bring to the surface that this is nothing but a determinant of a matrix whose rows are the three participating orbitals, with a different particle assigned to each column:

$$
|\alpha_1,\alpha_2,\alpha_3\rangle =
\begin{vmatrix}
|\alpha_1^{(1)}\rangle & |\alpha_1^{(2)}\rangle & |\alpha_1^{(3)}\rangle \\
|\alpha_2^{(1)}\rangle & |\alpha_2^{(2)}\rangle & |\alpha_2^{(3)}\rangle \\
|\alpha_3^{(1)}\rangle & |\alpha_3^{(2)}\rangle & |\alpha_3^{(3)}\rangle
\end{vmatrix}
\tag{11.13}
$$

where on the left of this equation I introduced the notation $|\alpha_1,\alpha_2,\alpha_3\rangle$, which contains all the information you need to know about the state presented on the right, namely, that this three-fermion state is formed by distributing three particles among orbitals $|\alpha_1\rangle$, $|\alpha_2\rangle$, and $|\alpha_3\rangle$. The right-hand side of this expression gives you a good mnemonic rule about how to combine these three orbitals into an antisymmetric three-fermion state. Arranging the orbitals into determinants makes the antisymmetry of the corresponding state obvious: the exchange of particles becomes mathematically equivalent to the interchange of the columns of the determinant, and this operation is well known to reverse its sign.
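The sign bookkeeping behind Eqs. 11.11 and 11.13 is easy to check numerically. The sketch below expands a determinant of distinct orbital labels into signed product terms and verifies that exchanging two particles flips the sign of the whole state. The representation of a state as a dictionary and all function names are my own illustrative choices.

```python
from itertools import permutations

def permutation_sign(perm):
    """Sign of a permutation of 0..n-1: +1 if even, -1 if odd."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:          # sort by transpositions, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def slater_determinant(orbitals):
    """Map each product term to its sign.

    A term is a tuple whose k-th entry is the orbital occupied by particle k+1.
    The orbital labels must be distinct.
    """
    n = len(orbitals)
    return {tuple(orbitals[p] for p in perm): permutation_sign(perm)
            for perm in permutations(range(n))}

def exchange(state, a, b):
    """Swap particles a and b (0-based) in every term of the state."""
    swapped = {}
    for term, coeff in state.items():
        t = list(term)
        t[a], t[b] = t[b], t[a]
        swapped[tuple(t)] = coeff
    return swapped

state = slater_determinant(("a1", "a2", "a3"))
flipped = exchange(state, 0, 1)

print(len(state))                                         # 6 terms, i.e. 3!
print(all(flipped[t] == -c for t, c in state.items()))    # True: the state is antisymmetric
```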

The idea to arrange orbitals into determinants in order to construct automatically antisymmetric many-fermion states was first used independently by Heisenberg and Dirac in their 1926 papers and expressed in a more formal way by John C. Slater, an American physicist, in 1929, and for this reason these determinants bear his name. That was a time when American physicists had to travel for postdoctoral positions to Europe, and not the other way around, so after getting his Ph.D. from Harvard, Slater moved to Cambridge and then to Copenhagen before coming back to the USA and joining the Physics Department at Harvard as a faculty member.

A word of caution: the fermion states in the form of the Slater determinant are not necessarily eigenvectors of a many-particle Hamiltonian, which, in general, can be presented in the form

$$
\hat{H}^{(N)} = \sum_{i=1}^{N}\hat{H}_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j\neq i}\hat{V}_{i,j}
\tag{11.14}
$$

where the first term is the sum of the single-particle Hamiltonians for each particle, which includes operators of the particle's kinetic energy and might include a term describing the interaction of each particle with some external object, e.g., an electric field, while the second term describes the interaction between the particles, most frequently the Coulomb repulsion between negatively charged electrons. The factor $1/2$ in front of the second term takes into account that the double summation over $i$ and $j$ counts the interaction between each pair of particles twice: once as $\hat{V}_{i,j}$ and the second time as $\hat{V}_{j,i}$. The principal difference between these two terms is that while each $\hat{H}_i$ acts only on the orbitals of “its own” particle, the interaction term acts on the orbitals of two particles. As a result, any simple tensor product of single-particle orbitals is an eigenvector of the first term of the many-particle Hamiltonian, but not of the entire Hamiltonian. Consider, for instance, the three-particle state from the previous example. Picking up just one term from Eq. 11.11, I can write

$$
\begin{aligned}
\left(\hat{H}_1 + \hat{H}_2 + \hat{H}_3\right)|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle
&= |\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle\hat{H}_1|\alpha_1^{(1)}\rangle + |\alpha_1^{(1)}\rangle|\alpha_3^{(3)}\rangle\hat{H}_2|\alpha_2^{(2)}\rangle + |\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle\hat{H}_3|\alpha_3^{(3)}\rangle \\
&= \left(E_1 + E_2 + E_3\right)|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle .
\end{aligned}
\tag{11.15}
$$


Since all other terms in Eq. 11.11 feature the same three orbitals, it is obvious that all of them are eigenvectors of this Hamiltonian with the same eigenvalue, so that the entire antisymmetric three-particle state given by the Slater determinant, Eq. 11.13, is also its eigenvector. It is also clear that for any Slater determinant state, the eigenvalue of the non-interacting Hamiltonian is always the sum of the single-particle energies of the orbitals used to construct the determinant. If, however, one adds the interaction term to the picture, the situation changes: products of single-particle orbitals are, in general, not eigenvectors of $\hat{V}_{i,j}$, which acts on states of two particles, so the Slater determinants are no longer stationary states of the many-fermion system. This does not mean, of course, that they are useless: they form a convenient basis in the space of many-particle states, which ensures that all states represented in this basis are antisymmetric. This brings me back to Eq. 11.12, which defines the dimension of this space and highlights the main difficulty of dealing with interacting many-particle systems: the space containing the corresponding states is just too large.

Consider, for instance, an atom of carbon with its six electrons. You can start building the basis for the six-electron space with the lowest energy orbitals and continue until you have enough basis vectors. The two lowest energy orbitals correspond to principal quantum number $n = 1$, orbital and magnetic numbers equal to zero, and two spin numbers $\pm 1/2$: $|1,0,0,1/2\rangle$ and $|1,0,0,-1/2\rangle$. This is definitely not enough for six electrons, so you need to go to orbitals with $n = 2$, of which there are 8: $|2,0,0,1/2\rangle$, $|2,0,0,-1/2\rangle$, $|2,1,-1,1/2\rangle$, $|2,1,-1,-1/2\rangle$, $|2,1,0,1/2\rangle$, $|2,1,0,-1/2\rangle$, $|2,1,1,1/2\rangle$, and $|2,1,1,-1/2\rangle$, where the notation follows the regular scheme $|n,l,m,m_s\rangle$ (I combined the spin number $m_s$ with the orbital quantum numbers for the sake of simplifying the notation). If I limit the space to just these ten orbitals (and it is by no means certain that orbitals with $n = 3$ can be left out), the total number of basis vectors in this space will be $10!/(6!\,4!) = 210$. It means that using the Slater determinants as a basis in this space, I will end up with the Hamiltonian of the system represented by a $210 \times 210$ matrix. Allowing the electrons to occupy the additional $n = 3$ orbitals, all 18 of them, will bring the dimensionality of the six-electron space to 376,740. I hope these examples give you a clear picture of how difficult problems with many interacting particles can be and explain why people have been busy inventing a great variety of approximate ways of dealing with them. Very often, the idea behind these methods is to replace the Hamiltonian in Eq. 11.14 by an effective Hamiltonian without an interaction term. The effects of the interaction in such approaches are hidden in “new” single-particle Hamiltonians, which retain some information about the interaction with other particles. A more detailed exposition of this issue is way beyond the scope of this book and can be found in many texts on atomic physics and quantum chemistry.
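The two numbers quoted for carbon follow directly from Eq. 11.12. A short sketch, assuming hydrogen-like shells with $2n^2$ orbitals each (the function names are hypothetical), shows how quickly the dimension grows:

```python
from math import comb

def orbitals_up_to(n_max):
    """Number of orbitals |n, l, m, m_s> with n <= n_max (2 n^2 per shell)."""
    return sum(2 * n * n for n in range(1, n_max + 1))

def fermion_space_dimension(n_particles, n_orbitals):
    """Eq. 11.12: number of Slater determinants for N fermions in M orbitals."""
    return comb(n_orbitals, n_particles)

for n_max in (2, 3):
    m = orbitals_up_to(n_max)
    print(n_max, m, fermion_space_dimension(6, m))
# 2 10 210
# 3 28 376740
```

The Hamiltonian matrix in this basis has the square of that many elements, which is why adding a single shell already makes a direct calculation so much more expensive.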

Before continuing to the next section, let me consider a few examples involving non-interacting indistinguishable particles so that you could get a better feel for the quantum mechanical indistinguishability.

Example 29 (Non-interacting Particles in a Potential Well.) Consider a system of three non-interacting particles in an infinite one-dimensional potential well. Assuming that the particles are (a) distinguishable spinless atoms of equal mass $m_a$, (b) electrons, and (c) indistinguishable spinless bosons, find the three lowest energy eigenvalues of this system, and write down the corresponding wave functions (spinors when necessary).

Solution

(a) In the case of three distinguishable atoms, no symmetry requirements can be imposed on the three-particle wave function, so the ground state energy corresponds to a state in which all three atoms are in the same single-particle ground state orbital:

$$
\psi_{1}^{(3)}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L}
\tag{11.16}
$$

with corresponding energy

$$
E_{1,1,1} = \frac{3\hbar^2\pi^2}{2 m_a L^2} .
\tag{11.17}
$$

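The second energy level would correspond to moving one of the atoms to the second single-particle orbital, so that I have for the three degenerate three-particle states

$$
\begin{aligned}
\psi_{2,1}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L} \\
\psi_{2,2}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L} \\
\psi_{2,3}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}
\end{aligned}
\tag{11.18}
$$

with the corresponding energy

$$
E_{2,1,1} = \frac{6\hbar^2\pi^2}{2 m_a L^2} .
\tag{11.19}
$$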

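Finally, the next lowest energy will correspond to two particles moved to the second single-particle level, with the wave functions and the triply degenerate energy level given by

$$
\begin{aligned}
\psi_{3,1}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L} \\
\psi_{3,2}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{2\pi z_3}{L} \\
\psi_{3,3}^{(3)}(z_1,z_2,z_3) &= \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}
\end{aligned}
\tag{11.20}
$$

$$
E_{2,2,1} = \frac{9\hbar^2\pi^2}{2 m_a L^2} .
\tag{11.21}
$$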

(b) Electrons are indistinguishable fermions, so their many-particle states must be antisymmetric. The single-particle orbitals are spinors, formed as a tensor product of the eigenvectors of the infinite potential well and of the spin operator $\hat{S}_z$. For convenience, I will begin by writing down the single-particle orbitals in the symbolic form $|n,m_s\rangle$, where $n$ corresponds to an energy level in the infinite well and $m_s$ is a spin magnetic number. To construct the vector representing the ground state of the three-electron system, I need to include three different orbitals with the lowest single-particle energies. Obviously these are $|1,\uparrow\rangle$, $|1,\downarrow\rangle$, $|2,m_s\rangle$. The choice of the spin state in the third orbital is arbitrary, so that there are two different ground states with the same energy. The respective Slater determinant becomes

$$
|1,1,2\rangle =
\begin{vmatrix}
|1,\uparrow\rangle_1 & |1,\uparrow\rangle_2 & |1,\uparrow\rangle_3 \\
|1,\downarrow\rangle_1 & |1,\downarrow\rangle_2 & |1,\downarrow\rangle_3 \\
|2,\uparrow\rangle_1 & |2,\uparrow\rangle_2 & |2,\uparrow\rangle_3
\end{vmatrix}
$$

where the subscript enumerates electrons, and I chose for concreteness the spin-up state for the spin portion of the third orbital. The notation $|1,1,2\rangle$ for the three-electron state reflects the eigenvectors of the infinite potential well “occupied”$^2$ by electrons in this state. Expanding the determinant and pulling out the spin state into a separate ket, I have

$$
\begin{aligned}
|1,1,2\rangle = {} & |1\rangle_1|\uparrow\rangle_1|1\rangle_2|\downarrow\rangle_2|2\rangle_3|\uparrow\rangle_3 + |2\rangle_1|\uparrow\rangle_1|1\rangle_2|\uparrow\rangle_2|1\rangle_3|\downarrow\rangle_3 + |1\rangle_1|\downarrow\rangle_1|2\rangle_2|\uparrow\rangle_2|1\rangle_3|\uparrow\rangle_3 \\
& - |2\rangle_1|\uparrow\rangle_1|1\rangle_2|\downarrow\rangle_2|1\rangle_3|\uparrow\rangle_3 - |1\rangle_1|\downarrow\rangle_1|1\rangle_2|\uparrow\rangle_2|2\rangle_3|\uparrow\rangle_3 - |1\rangle_1|\uparrow\rangle_1|2\rangle_2|\uparrow\rangle_2|1\rangle_3|\downarrow\rangle_3 .
\end{aligned}
$$

$^2$ “Occupied” in this context means that a given orbital participates in the formation of a given many-particle state.

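Bringing back the position representation of the eigenvectors of the well, the last result can be written down as

$$
\begin{aligned}
|1,1,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
+ \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix}
- \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}
\Bigg] .
\end{aligned}
\tag{11.22}
$$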

To get a bit more comfortable with this expression, let’s apply operator

OH D OH.1/ C OH.2/ C OH.3/;

where OH.i/ is a single-electron infinite potential well Hamiltonian, which in the spinor representation is proportional to a unit matrix:

OH j1; 1; 2i D s

2

L

3 �

� OH.1/ sin �z1L 0

� 0

sin �z2L

� sin 2�z3L 0

� C

0 OH.1/ sin �z1L

� sin �z2L 0

� sin 2�z3L 0

� C

0

OH.1/ sin �z1L

� sin 2�z2L 0

� 0

sin �z3L

� � OH.1/ sin 2�z1L

0

� 0

sin �z2L

� sin �z3L 0

� �

0

OH.1/ sin �z1L

� sin �z2L 0

� sin 2�z3L 0

� � OH.1/ sin �z1L

0

� sin 2�z2L 0

� 0

sin �z3L

� C

sin �z1L 0

� 0

OH.2/ sin �z2L

� sin 2�z3L 0

� C

0

sin �z1L

� OH.2/ sin �z2L 0

� sin 2�z3L 0

� C

0

sin �z1L

� OH.2/ sin 2�z2L 0

� 0

sin �z3L

� �

sin 2�z1L 0

� 0

OH.2/ sin �z2L

� sin �z3L 0

� �

0

sin �z1L

� OH.2/ sin �z2L 0

� sin 2�z3L 0

� �

sin �z1L 0

� OH.2/ sin 2�z2L 0

� 0

sin �z3L

� C

sin �z1L 0

� 0

sin �z2L

� OH.3/ sin 2�z3L 0

� C

0

sin �z1L

� sin �z2L 0

� OH.3/ sin 2�z3L 0

� C

11.2 Constructing a Basis in a Many-Fermion Space 359

0

sin �z1L

� sin 2�z2L 0

� 0

OH.3/ sin �z3L

� �

sin 2�z1L 0

� 0

sin �z2L

� OH.3/ sin �z3L 0

� �

0

sin �z1L

� sin �z2L 0

� OH.3/ sin 2�z3L 0

� �

sin �z1L 0

� sin 2�z2L 0

� 0

OH.3/ sin �z3L

�� :

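I understand that this expression looks awfully intimidating (or just awful), but I still want you to gather your wits and go through it line by line, and let the force be with you. The first thing that you should notice is that every single-particle Hamiltonian affects only those orbitals that contain its own particle. Now remembering that each of the orbitals is an eigenvector of the corresponding Hamiltonian, you can rewrite the above expression as

$$
\begin{aligned}
\hat{H}|1,1,2\rangle = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& E_1\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
+ E_2\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + E_1\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix}
- E_2\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - E_1\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- E_1\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + E_1\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
+ E_1\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + E_2\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix}
- E_1\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - E_1\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- E_2\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + E_2\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
+ E_1\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + E_1\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix}
- E_1\begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - E_2\begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- E_1\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}
\Bigg] ,
\end{aligned}
$$

where $E_{1,2}$ are the eigenvalues of energy corresponding to the eigenvectors $|1\rangle$, $|2\rangle$ of the infinite potential well. Combining the like terms (terms with the same combination of single-particle orbitals), you will find

$$
\hat{H}|1,1,2\rangle = \left(2E_1 + E_2\right)|1,1,2\rangle .
$$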
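If the bookkeeping above feels error-prone, the six terms of the determinant and their common eigenvalue can be generated mechanically. The sketch below is an illustration under my own conventions: an orbital is a pair `(n, spin)`, energies are measured in units of $E_1$ (so that $E_n = n^2$), and the helper names are hypothetical.

```python
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (counts inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def expand(orbitals):
    """Signed terms of the Slater determinant.

    Each term is (sign, occupation), where occupation[k] is the orbital
    assigned to electron k+1.
    """
    n = len(orbitals)
    return [(parity(p), tuple(orbitals[k] for k in p))
            for p in permutations(range(n))]

# Rows of the determinant for the state |1,1,2> with spin-up in the third orbital.
orbitals = [(1, "up"), (1, "down"), (2, "up")]
terms = expand(orbitals)

for sign, occupation in terms:
    # Energy of a product term in units of E_1: sum of n^2 over the electrons.
    energy = sum(n * n for n, _ in occupation)
    print("+" if sign > 0 else "-", occupation, energy)
```

Every line of the output carries the same energy, 6 in units of $E_1$, which is just $2E_1 + E_2$; three terms come with a plus sign and three with a minus sign.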

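The second eigenvector belonging to this eigenvalue can be generated by changing the spin state paired with the orbital state $|2\rangle$ from spin-up to spin-down, which yields

$$
\begin{aligned}
|1,1,2\rangle^{(2)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
+ \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} \\
& + \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix}
- \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
- \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}
\Bigg] .
\end{aligned}
$$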

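To get the next energy level and the corresponding eigenvector, I just need to move one of the particles to the orbital $|2\rangle|m_s\rangle$, which means that the Slater determinant is now formed by orbitals $|1,m_s\rangle$, $|2,\downarrow\rangle$, $|2,\uparrow\rangle$ with an arbitrary value of the spin state in the single-particle ground state. Using for concreteness the spin-up value in $|1,m_s\rangle$, I can write

$$
\begin{aligned}
|1,2,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
+ \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& + \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} \\
& - \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
- \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
\Bigg] .
\end{aligned}
$$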

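The energy corresponding to this state is $E_{2,2,1} = E_1 + 2E_2$ and coincides with Eq. 11.21 for the energy of the second excited level in the system of the distinguishable particles. Finally, one more level is generated by keeping the two orbitals corresponding to the second level of the well with different values of the spin number; the only choice for the third orbital is then one of the two $|3\rangle|m_s\rangle$ orbitals, which results in two degenerate eigenvectors, one of which is shown below:

$$
\begin{aligned}
|2,2,3\rangle^{(2)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \begin{pmatrix}\sin\frac{3\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
+ \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_3}{L}\\ 0\end{pmatrix} \\
& + \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
- \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_3}{L}\\ 0\end{pmatrix} \\
& - \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix}
- \begin{pmatrix}\sin\frac{3\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}
\Bigg] .
\end{aligned}
$$

(I derived this expression by simply replacing $\sin\frac{\pi z_i}{L}$ everywhere in the previous state with $\sin\frac{3\pi z_i}{L}$.) The respective energy value is given by

$$
E_{2,2,3} = E_3 + 2E_2 = \frac{17\hbar^2\pi^2}{2 m_e L^2} .
$$

Note that the configuration built from $|1,\uparrow\rangle$, $|1,\downarrow\rangle$, and one of the $|3\rangle|m_s\rangle$ orbitals has the lower energy $2E_1 + E_3 = 11\hbar^2\pi^2/(2 m_e L^2)$, so it is that configuration, rather than $|2,2,3\rangle$, which gives the third lowest level of the three-electron system; its eigenvectors are obtained from Eq. 11.22 by replacing $\sin\frac{2\pi z_i}{L}$ with $\sin\frac{3\pi z_i}{L}$.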

(c) Now, let me deal with the system of three identical spinless bosons. The symmetry requirement for the three-boson system allows using identical orbitals for all particles (the resulting state is automatically symmetric); thus, the ground state can be built of a single orbital $|1\rangle$ and turns out to be the same as in the case of distinguishable particles, with the same energy value (Eq. 11.17). A difference from distinguishable particles arises when transitioning to excited states. Now, to satisfy the symmetry requirements, I have to turn the three degenerate states of Eqs. 11.18 and 11.20, with energies given by Eqs. 11.19 and 11.21, into single non-degenerate states:

$$
\begin{aligned}
\psi_{2,1,1}^{(3)}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L}
+ \sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L} \\
& + \sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}\Bigg]
\end{aligned}
$$

$$
\begin{aligned}
\psi_{2,2,1}^{(3)}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[
& \sin\frac{2\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L}
+ \sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{2\pi z_3}{L} \\
& + \sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}\Bigg] .
\end{aligned}
$$
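The level structure for the three cases of this example can be cross-checked by brute force. The sketch below is my own illustration, not part of the original solution: it measures energies in units of $\hbar^2\pi^2/(2mL^2)$, so that a particle on level $n$ contributes $n^2$, and it truncates the well at a hypothetical cutoff `n_max`, which is large enough for the lowest three levels.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement, product

def lowest_levels(kind, n_particles=3, n_max=4, count=3):
    """Lowest total energies with degeneracies, as (energy, degeneracy) pairs.

    Energies are in units of hbar^2 pi^2 / (2 m L^2).
    """
    levels = range(1, n_max + 1)
    if kind == "distinguishable":
        # Every ordered assignment of levels to particles is a separate state.
        occupations = product(levels, repeat=n_particles)
    elif kind == "bosons":
        # Only the set of occupation numbers matters; levels may repeat.
        occupations = combinations_with_replacement(levels, n_particles)
    elif kind == "electrons":
        # Spin-orbitals (n, spin); the Pauli principle forbids repeats.
        spin_orbitals = [(n, s) for n in levels for s in ("up", "down")]
        occupations = ([n for n, _ in chosen]
                       for chosen in combinations(spin_orbitals, n_particles))
    else:
        raise ValueError(kind)
    degeneracy = Counter(sum(n * n for n in occ) for occ in occupations)
    return sorted(degeneracy.items())[:count]

print(lowest_levels("distinguishable"))  # [(3, 1), (6, 3), (9, 3)]
print(lowest_levels("electrons"))        # [(6, 2), (9, 2), (11, 2)]
print(lowest_levels("bosons"))           # [(3, 1), (6, 1), (9, 1)]
```

For electrons the enumeration places the level with energy 11 (orbitals $|1,\uparrow\rangle$, $|1,\downarrow\rangle$, $|3,m_s\rangle$) below the energy 17 of a $|2,2,3\rangle$ state, so the latter is not among the three lowest levels.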


11.3 Pauli Principle and Periodic Table of Elements: Electronic Structure of Atoms

While we are not equipped to deal with systems of large numbers of interacting particles, you can still appreciate how Pauli's exclusion principle helped to understand the periodicity in the properties of atoms. To follow the arguments, you need to keep in mind two important points. First, when discussing the chemical properties of atoms, people are interested foremost in the many-particle ground state, i.e., the state of many electrons with the lowest possible energy. Second, since the Pauli principle forbids states in which two electrons occupy the same orbital, you have to build many-particle states using at least as many orbitals as there are particles in your system, starting with the ground state orbitals and adding new orbitals in a way that minimizes the unavoidable increase of the sum of the single-particle energies of all involved electrons. This last point implicitly assumes that the lowest energy of non-interacting particles would remain the lowest energy even if the interaction is taken into account. This assumption is not always true, but the discussion of this issue is beyond the scope of this book. Anyway, having these two points in mind, let's consider what happens with the states of electrons as we move along the periodic table. Helium occupies the second place in the first row and is known as an inert gas, meaning that it is very stable and is not eager to participate in chemical reactions or form chemical bonds. It has two electrons, and therefore you need only two orbitals, which can have the same value of the principal number $n = 1$, to construct a two-electron state:

$$
|1,0,0,1/2\rangle_1\,|1,0,0,-1/2\rangle_2 - |1,0,0,1/2\rangle_2\,|1,0,0,-1/2\rangle_1 .
$$

These two orbitals exhaust all available states with the same principal number. In chemical language, we can say that the electrons in the helium atom belong to a complete or closed shell. Going to the next atom, lithium (Li), you will notice that it has very different chemical properties: lithium is an active alkali metal, which readily participates in a variety of chemical reactions and forms a number of different compounds, gladly offering one of its electrons for chemical bonding. Three lithium electrons need more than two orbitals to form a three-electron state, so you must start dealing with orbitals characterized by principal number $n = 2$. There are eight of them, but only one is really required to form the lowest energy three-electron state, and as a result seven of those orbitals remain, using physicist's jargon, “unoccupied.” As you go along the second row of the periodic table, the number of electrons increases to four in the case of beryllium, five for boron, six for carbon, seven for nitrogen, eight for oxygen, nine for fluorine, and finally ten for neon. With an increasing number of electrons, you must add additional orbitals to be able to create the corresponding many-electron states, so that the number of “unoccupied” orbitals decreases. As the number of available, unused orbitals is getting smaller, the chemical activity of the corresponding substances diminishes, until you reach another inert gas, neon. To construct a many-electron state for neon with ten electrons, you have to use all ten available orbitals with $n = 1$ and $n = 2$. Consequently, the electron structure of neon is again characterized as a closed shell configuration. A popular way to visualize this process of filling up the available orbitals consists in assigning numbers $1, 2, \ldots$ to the principal quantum number $n$, and letters $s$, $p$, $d$, and $f$ to orbitals with orbital angular momentum number $l$ equal to 0, 1, 2, and 3, respectively. The configuration of helium in this notation, primarily used in atomic physics and quantum chemistry, would be $1s^2$, where the first number stands for the principal number, and the upper index indicates the number of electrons available for assignment to orbitals with $l = 0$. The electronic structure of elements in the second row of the periodic table discussed above is shown in Table 11.1.

Table 11.1 Elements of the second row of the periodic table and electronic configurations of their ground states in terms of single-electron orbitals and the term symbols

| Element | Configuration | Term symbol |
| $_{3}\mathrm{Li}$ | $1s^2 2s^1$ | $^2S_{1/2}$ |
| $_{4}\mathrm{Be}$ | $1s^2 2s^2$ | $^1S_{0}$ |
| $_{5}\mathrm{B}$ | $1s^2 2s^2 2p^1$ | $^2P_{1/2}$ |
| $_{6}\mathrm{C}$ | $1s^2 2s^2 2p^2$ | $^3P_{0}$ |
| $_{7}\mathrm{N}$ | $1s^2 2s^2 2p^3$ | $^4S_{3/2}$ |
| $_{8}\mathrm{O}$ | $1s^2 2s^2 2p^4$ | $^3P_{2}$ |
| $_{9}\mathrm{F}$ | $1s^2 2s^2 2p^5$ | $^2P_{3/2}$ |
| $_{10}\mathrm{Ne}$ | $1s^2 2s^2 2p^6$ | $^1S_{0}$ |

You can see from this table that the $l = 0$ orbitals are added first to the list of available single-electron states, and only after that are the additional six orbitals with $l = 1$ and different values of $m$ and $m_s$ thrown in. The supposition here is that single-electron states with $l = 0$ would contribute less energy than the $l = 1$ states.$^3$ Therefore, the two $l = 0$ orbitals must be incorporated into the basis first. The assumption that orbitals with larger $n$ and larger $l$ contribute more energy, and must therefore be added only after the orbitals with lower values of these numbers are filled, is not always correct: for some elements, orbitals with lower $l$ and higher $n$ contribute less energy than orbitals with higher $l$ and lower $n$. This happens, for instance, with the orbital $4s$, which contributes less energy than the orbital $3d$, but there are no simple hand-waving arguments that could explain or predict this behavior. Anyway, going now to the third row of the periodic table, you again start with a new set of orbitals characterized by $n = 3$, plenty of which are available for the 11 electrons of the first element, another alkali metal, sodium. I think you get the gist of how it works, but you should be aware that this line of argument is still a gross oversimplification: the periodic table of elements is not that periodic in some instances, and there are lots of elements that do not fit this simple model of closed shells.

Single-electron orbitals $|n,l,m,m_s\rangle$ based on eigenvectors of operators of orbital and spin angular momenta are not the only way to characterize the ground states of atoms. An alternative approach is based on using eigenvectors of the total orbital angular momentum $\hat{\mathbf{L}}^{(tot)} = \sum_i \hat{\mathbf{L}}^{(i)}$ (the sum of the orbital momenta of all electrons), the total spin of all electrons $\hat{\mathbf{S}}^{(tot)} = \sum_i \hat{\mathbf{S}}^{(i)}$, and the grand total momentum $\hat{\mathbf{J}} = \hat{\mathbf{L}}^{(tot)} + \hat{\mathbf{S}}^{(tot)}$.

$^3$ I have to remind you that while the hydrogen energy levels are degenerate with respect to $l$, for other atoms this is not true because of the interaction with other electrons.

Properties of the sum of two arbitrary angular momentum operators, $\hat{\mathbf{J}}^{(1)}$ and $\hat{\mathbf{J}}^{(2)}$, can be figured out by generalizing the results for the sum of two spins or of the spin $1/2$ and the orbital angular momentum presented in Chap. 9. The eigenvectors of the operator $\left(\hat{\mathbf{J}}^{(1)} + \hat{\mathbf{J}}^{(2)}\right)^2$ are characterized by quantum number $j$, which can take values, changing in steps of one, in the interval

$$
|j_1 - j_2| \le j \le j_1 + j_2 ,
\tag{11.23}
$$

where $j_1$ and $j_2$ refer to eigenvalues of $\left(\hat{\mathbf{J}}^{(1)}\right)^2$ and $\left(\hat{\mathbf{J}}^{(2)}\right)^2$, respectively. For each $j$, eigenvalues of $\hat{J}_z^{(1)} + \hat{J}_z^{(2)}$ are characterized by magnetic numbers $M_j$ obeying the usual inequality $|M_j| \le j$ and related to the individual magnetic numbers $m_{j_1}$ and $m_{j_2}$ of $\hat{J}_z^{(1)}$ and $\hat{J}_z^{(2)}$ as

$$
M_j = m_{j_1} + m_{j_2} .
\tag{11.24}
$$

While Eq. 11.24 can be easily derived, proving Eq. 11.23 is a bit more than you can chew at this stage, but you may at least verify that it agrees with the cases considered in Chap. 9: for two $1/2$ spins, Eq. 11.23 gives two values for $j$, $j = 1, 0$, in agreement with Eqs. 9.54 and 9.55, and for the sum of the orbital momentum and the $1/2$ spin, Eq. 11.23 yields $j = l \pm 1/2$, again in agreement with Sect. 9.5.2.
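A quick way to gain confidence in Eq. 11.23 is to check that it accounts for all states: the multiplets with the allowed values of $j$ must contain exactly $(2j_1+1)(2j_2+1)$ vectors. The following sketch does this with exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def allowed_j(j1, j2):
    """Values of j permitted by Eq. 11.23, from |j1 - j2| to j1 + j2 in unit steps."""
    j1, j2 = Fraction(j1), Fraction(j2)
    values = []
    j = abs(j1 - j2)
    while j <= j1 + j2:
        values.append(j)
        j += 1
    return values

def multiplet_size(j):
    """Number of magnetic sublevels M_j = -j, ..., j."""
    return int(2 * j + 1)

for j1, j2 in [("1/2", "1/2"), (1, "1/2"), (1, 1)]:
    js = allowed_j(j1, j2)
    total = sum(multiplet_size(j) for j in js)
    product_dim = multiplet_size(Fraction(j1)) * multiplet_size(Fraction(j2))
    print([str(j) for j in js], total, product_dim)
# ['0', '1'] 4 4
# ['1/2', '3/2'] 6 6
# ['0', '1', '2'] 9 9
```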

The transition from the description of many-fermion states in terms of single-particle orbitals to the basis formed by eigenvectors of total orbital momentum, total spin, and grand total angular momentum raises an important issue of separate symmetry properties of many-particle orbital and spin states. Consider again, for simplicity, two fermions that can individually be in orbital states $|\psi_1\rangle$ and $|\psi_2\rangle$ and spin states $|\uparrow\rangle$ and $|\downarrow\rangle$. In the description where spin and orbital states are lumped together in a single-particle orbital (this is what I did writing equations such as Eq. 11.11 or 11.13), I would have introduced four single-electron orbitals $|\alpha_i\rangle$:

$$
|\alpha_1\rangle \equiv |\psi_1,\uparrow\rangle; \quad |\alpha_2\rangle \equiv |\psi_2,\uparrow\rangle; \quad |\alpha_3\rangle \equiv |\psi_1,\downarrow\rangle; \quad |\alpha_4\rangle \equiv |\psi_2,\downarrow\rangle
$$

and used them as a basis in a six-dimensional, $4!/(2!\,2!) = 6$, two-fermion space. If, however, I preferred to use eigenvectors of the total spin of the two particles as a basis in the spin sector of the total spin–orbital two-particle space, separating thereby the orbital and spin states, I would have to make sure that both the former and the latter components separately possess a definite parity. The four eigenvectors of the total spin of two spin-$1/2$ particles, indeed, contain a symmetric triplet $|1,M_S\rangle$ of states with total $S^{(tot)} = 1$ (see Eq. 9.54) and one antisymmetric singlet state (Eq. 9.55) with total $S^{(tot)} = 0$. Thus, if I take these states as the spin components of the total basis of two-particle fermion states, then the symmetry of the spin component will dictate the symmetry of the orbital portion. Indeed, to make the entire two-fermion state antisymmetric, the orbital component paired with any of the symmetric two-spin states $|1,M_S\rangle$ must itself be antisymmetric. Two available orbital states can only yield a single antisymmetric combination, resulting in three basis vectors characterized by the value of total spin $S^{(tot)} = 1$:

$$
\begin{aligned}
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,-1\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,0\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,1\rangle ,
\end{aligned}
\tag{11.25}
$$

where the $1/\sqrt{2}$ factor ensures the normalization of the vector representing the orbital portion of the state. The remaining total spin eigenvector corresponding to $S = 0$ is an antisymmetric singlet $|0,0\rangle$. Consequently, the corresponding orbital part of the two-particle state must be symmetric, resulting in three additional possible states:

$$
\begin{aligned}
&|\psi_1^{(1)}\rangle|\psi_1^{(2)}\rangle|0,0\rangle \\
&|\psi_2^{(1)}\rangle|\psi_2^{(2)}\rangle|0,0\rangle \\
&\frac{1}{\sqrt{2}}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|0,0\rangle .
\end{aligned}
\tag{11.26}
$$

You may notice that the first two of these states are formed by identical orbitals. This is not forbidden by the Pauli principle because the spin state of the two electrons in this case is antisymmetric. This situation is often described by saying that the two electrons in the same orbital state have “opposite” spins, which is not exactly accurate. Indeed, “opposite” can refer only to the possible values of the $z$-component of spin, but those can have opposite values in the singlet state as well as in the triplet state with $M_S = 0$. Thus, it is more accurate to describe this situation as a total spin zero or a singlet state. Combining the three spin-antisymmetric states, Eq. 11.26, with the three orbital-antisymmetric states, Eq. 11.25, you find that the total number of basis vectors in this representation is the same (six) as in the single-particle orbital basis, confirming that this is just an alternative basis in the same vector space.
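The agreement between the two ways of counting is not an accident of this small example. The sketch below extends the count to an arbitrary number of orbital states (the generalization and the function name are my own additions): antisymmetric orbital combinations pair with the three triplet states, and symmetric ones pair with the singlet.

```python
from math import comb

def two_fermion_count(n_orbital_states):
    """Count two-fermion basis vectors in the spin-separated basis.

    Returns (triplet states, singlet states, size of the Slater-determinant
    basis built from 2 * n_orbital_states spin-orbitals).
    """
    antisym_orbital = comb(n_orbital_states, 2)      # pairs of different orbital states
    sym_orbital = comb(n_orbital_states + 1, 2)      # pairs, identical states allowed
    triplet = 3 * antisym_orbital                    # three symmetric spin states
    singlet = 1 * sym_orbital                        # one antisymmetric spin state
    return triplet, singlet, comb(2 * n_orbital_states, 2)

print(two_fermion_count(2))   # (3, 3, 6)
print(two_fermion_count(3))   # (9, 6, 15)
```

In both cases the triplet and singlet counts add up to the number of Slater determinants, as they must for two bases of the same space.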

A more realistic example of a basis based on the separation of many-fermion spin and orbital states would include two particles and at least three single-particle orbital states corresponding to $l = 1$, $m = -1, 0, 1$. The total orbital angular momentum of two electrons in this case can take three values, $L = 0, 1, 2$, with the total number of corresponding states being $1 + 3 + 5 = 9$ with various values of the magnetic number $M$. To figure out the symmetry of these states, you would need to present them as a linear combination of single-particle states using the Clebsch–Gordan coefficients, similar to what I did in Sect. 9.5.2:

$$
|L, l_1, l_2, M\rangle = \sum_{m_1, m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\, |l_1 m_1\rangle |l_2, m_2\rangle\, \delta_{m_2, M - m_1}, \tag{11.27}
$$

where Kronecker’s delta makes sure that Eq. 11.24 is respected. The particle exchange symmetry of the states represented by $|L, l_1, l_2, M\rangle$ is determined by the transformation rule of the Clebsch–Gordan coefficients with respect to the transposition of indexes $l_1, m_1$ and $l_2, m_2$, which you will have to accept without proof:

$$
C^{L,l_1,l_2}_{M,m_1,m_2} = (-1)^{L - l_1 - l_2}\, C^{L,l_2,l_1}_{M,m_2,m_1}. \tag{11.28}
$$

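Indeed, applying the exchange operator $\hat{P}(1,2)$ to Eq. 11.27, you will see that its action on the right-hand side of the equation consists in the interchange of indexes $l_1$ and $l_2$ in the Clebsch–Gordan coefficients:

$$
\begin{aligned}
\hat{P}(1,2)\, |L, l_1, l_2, M\rangle &= \sum_{m_1, m_2} C^{L,l_2,l_1}_{M,m_2,m_1}\, |l_1 m_1\rangle |l_2, m_2\rangle\, \delta_{m_1, M - m_2} \\
&= (-1)^{L - l_1 - l_2} \sum_{m_1, m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\, |l_1 m_1\rangle |l_2, m_2\rangle\, \delta_{m_2, M - m_1} \\
&= (-1)^{L - l_1 - l_2}\, |L, l_1, l_2, M\rangle.
\end{aligned}
$$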

In the second line of this expression, I used the transposition property of $C^{L,l_1,l_2}_{M,m_1,m_2}$, Eq. 11.28. With this it becomes quite evident that the state $|L, l_1, l_2, M\rangle$ is symmetric with respect to the exchange of particles if $L - l_1 - l_2$ is even and is antisymmetric if $L - l_1 - l_2$ is odd. In the example when $l_1 = l_2 = 1$, which I am trying to figure out now, this rule yields that the states with $L = 2$ and $L = 0$ are symmetric and the states with $L = 1$ are antisymmetric. Correspondingly, the latter must be paired with a triplet spin state, while the former two must go together with the zero spin state. Since the total number of single-electron orbitals in this case is 6, the expected number of two-particle antisymmetric basis vectors is $6!/(4!\,2!) = 15$, and if you insist I can list all of them below (I will use a simplified notation $|L, M\rangle$ omitting $l_1$ and $l_2$):

$$
\begin{aligned}
&|2, M\rangle\, |0, 0\rangle \\
&|1, M\rangle\, |1, M_s\rangle \\
&|0, 0\rangle\, |0, 0\rangle.
\end{aligned}
\tag{11.29}
$$

The first line in this expression contains five vectors with $-2 \le M \le 2$, the second line represents $3 \times 3 = 9$ vectors with both $M$ and $M_s$ taking three values each, and finally, the last line supplies the last, 15th vector to the basis.
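The counting above can be reproduced with a few lines of code. The following is only a sketch: the function names `two_electron_terms` and `count_states` are hypothetical, and the pairing rule (symmetric orbital state with the singlet, antisymmetric orbital state with the triplet) is taken directly from the parity $(-1)^{L - l_1 - l_2}$ discussed in the text for two electrons with the same $l$.

```python
from math import comb


def two_electron_terms(l):
    """Allowed (L, S) pairs for two electrons in orbitals with the same l."""
    terms = []
    for L in range(0, 2 * l + 1):
        # Orbital exchange parity (-1)^(L - 2l) = (-1)^L, as in Eq. 11.28.
        orbital_symmetric = (L % 2 == 0)
        # Antisymmetric total state: symmetric orbital part goes with the
        # singlet (S = 0), antisymmetric orbital part with the triplet (S = 1).
        S = 0 if orbital_symmetric else 1
        terms.append((L, S))
    return terms


def count_states(terms):
    """Total number of |L, M>|S, M_s> vectors for the given (L, S) pairs."""
    return sum((2 * L + 1) * (2 * S + 1) for L, S in terms)


terms = two_electron_terms(1)
print(terms)                # (L, S) pairs for l1 = l2 = 1
print(count_states(terms))  # number of antisymmetric two-electron vectors
print(comb(6, 2))           # 6!/(4! 2!) from the single-particle orbital basis
```

For $l = 1$ both counts agree, which is the statement that Eq. 11.29 is just another basis in the same 15-dimensional space.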


Finally, to complete the picture, I can rewrite these vectors in terms of the grand total momentum $\hat{\mathbf{J}}$. The first five vectors from the expression above obviously correspond to $j = 2$, so one can easily replace this line with vectors $|2, M_J\rangle$, where the first number now corresponds to the value of $j$. The nine vectors from the second line correspond to three values of $j$: $j = 2, 1, 0$. While this situation appears terribly similar to the case of $l_1 = 1$ and $l_2 = 1$ states considered previously, the significant difference is that vectors $|1, M\rangle |1, M_s\rangle$ are no longer associated with just one or another particle, so that Eq. 11.28 has no relation to the symmetry properties of the resulting states $|j, 1, 1, M_J\rangle$ with respect to the exchange of the particles. All these states are as antisymmetric under operator $\hat{P}(1,2)$ as states $|1, M\rangle |1, M_s\rangle$. The last line in Eq. 11.29 obviously corresponds to a single state with zero grand total angular momentum, which simply coincides with $|0, 0\rangle |0, 0\rangle$. In summary, the antisymmetric basis in terms of eigenvectors of operators $\hat{J}^2$, $\left(\hat{L}^{(tot)}\right)^2$, $\left(\hat{S}^{(tot)}\right)^2$, and $\hat{J}_z$ is formed by vectors $|j, L, S, M_J\rangle$:

$$
|2, 2, 0, M_J\rangle,\quad |2, 1, 1, M_J\rangle,\quad |1, 1, 1, M_J\rangle,\quad |0, 1, 1, 0\rangle,\quad |0, 0, 0, 0\rangle. \tag{11.30}
$$

It is easy to check that this basis also consists of $5 + 5 + 3 + 1 + 1 = 15$ vectors. They can be expressed as linear combinations of eigenvectors of the total orbital and total spin momenta (Eq. 11.29) with the help of Eq. 11.27 and the same Clebsch–Gordan coefficients, which can always be found on the Internet. Just to illustrate this point, let me do it for the grand total eigenvector $|2, 1, 1, 0\rangle$ using one of the tables of the Clebsch–Gordan coefficients that Google dug out for me in the depth of the World Wide Web:

$$
|2, 1, 1, 0\rangle = \sqrt{\frac{1}{6}}\, |1, 1\rangle |1, -1\rangle + \sqrt{\frac{1}{6}}\, |1, -1\rangle |1, 1\rangle + \sqrt{\frac{2}{3}}\, |1, 0\rangle |1, 0\rangle.
$$
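If you would rather not trust a table found online, the coefficients can be computed directly. The sketch below uses the standard closed-form (Racah) expression for the Clebsch–Gordan coefficients, restricted to integer angular momenta for simplicity. The function name `clebsch_gordan` and its argument order are my own choices for illustration, not something defined in this book.

```python
from math import factorial, sqrt


def clebsch_gordan(j1, m1, j2, m2, J, M):
    """<j1 m1; j2 m2 | J M> for integer angular momenta (Racah's closed formula)."""
    if m1 + m2 != M or not abs(j1 - j2) <= J <= j1 + j2:
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(M) > J:
        return 0.0
    f = factorial
    prefactor = sqrt(
        (2 * J + 1) * f(J + j1 - j2) * f(J - j1 + j2) * f(j1 + j2 - J)
        / f(j1 + j2 + J + 1)
    )
    prefactor *= sqrt(
        f(J + M) * f(J - M) * f(j1 - m1) * f(j1 + m1) * f(j2 - m2) * f(j2 + m2)
    )
    total = 0.0
    for k in range(0, j1 + j2 - J + 1):
        args = (k, j1 + j2 - J - k, j1 - m1 - k, j2 + m2 - k,
                J - j2 + m1 + k, J - j1 - m2 + k)
        if min(args) < 0:
            continue  # terms with a negative factorial argument do not contribute
        denom = 1
        for a in args:
            denom *= f(a)
        total += (-1) ** k / denom
    return prefactor * total


# Coefficients of |1, m1>|1, -m1> in the state with total momentum 2 and projection 0.
for m1 in (1, 0, -1):
    print(m1, -m1, clebsch_gordan(1, m1, 1, -m1, 2, 0))
```

The same function can be used to check the transposition rule of Eq. 11.28: swapping the pairs $(l_1, m_1)$ and $(l_2, m_2)$ changes the sign of the coefficient only when $L - l_1 - l_2$ is odd.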

Values of the total orbital, spin, and grand total momentum are often used to designate the electronic structure of atoms, instead of single-electron orbitals, in the form of the so-called term symbol:

$$
{}^{2S+1}L_J. \tag{11.31}
$$

Here the center symbol designates the value of the total orbital momentum using the same correspondence between numerical values and letters, $S, P, D, F$ for $0, 1, 2, 3$, as in the single-electron orbital case but with capital rather than lowercase letters. The right subscript shows the value of the grand total momentum, and the left superscript shows the multiplicity of the respective energy configuration with respect to the total spin magnetic number $M_s$. For instance, using this notation, the states $|1, 1, 1, M_J\rangle$ can be described as ${}^3P_1$, while states $|2, 2, 0, M_J\rangle$ become ${}^1D_2$.
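As a small illustration of the notation, here is a sketch that turns the numbers $(L, S, J)$ into a term symbol written on one line (for example `3P1` for ${}^3P_1$) and lists the values of $J$ compatible with given $L$ and $S$. The helper names `term_symbol` and `allowed_j` are hypothetical, and the letters beyond $F$ follow the usual spectroscopic sequence, which is an addition to what is stated above.

```python
from fractions import Fraction

# Spectroscopic letters for L = 0, 1, 2, 3, ...; the text lists S, P, D, F,
# the rest is the conventional continuation (the letter J is skipped).
LETTERS = "SPDFGHIK"


def allowed_j(L, S):
    """Values of J from |L - S| to L + S in unit steps."""
    S = Fraction(S)
    j = abs(L - S)
    values = []
    while j <= L + S:
        values.append(j)
        j += 1
    return values


def term_symbol(L, S, J):
    """Plain-text term symbol: multiplicity 2S+1, letter for L, then J."""
    multiplicity = 2 * Fraction(S) + 1
    return f"{multiplicity}{LETTERS[L]}{Fraction(J)}"


print(term_symbol(1, 1, 1))                            # states |1,1,1,M_J>
print(term_symbol(2, 0, 2))                            # states |2,2,0,M_J>
print(term_symbol(0, Fraction(1, 2), Fraction(1, 2)))  # one s-electron outside closed shells
print(allowed_j(1, Fraction(1, 2)))                    # possible J for L = 1, S = 1/2
```

Half-integer spins are passed as `Fraction(1, 2)` so that the arithmetic stays exact.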

The example of two electrons and three available single-particle orbital states is more realistic than the one with only two such states, but it is still a far cry from what people have to deal with when analyzing real atoms. The system of only two electrons corresponds to the helium atom, and one only needs one orbital state with $l_{1,2} = 0$ to construct an antisymmetric two-electron ground state. In terms of eigenvectors of total angular momentum and total spin, this state corresponds to $L = 0$, $S = 0$: $|0, 0\rangle |0, 0\rangle$, where the orbital component is symmetric (both electrons are in the same orbital state), and the spin component is antisymmetric (spins are in an antisymmetric singlet state). The term symbol for this state is obviously ${}^1S_0$.

Going from helium to lithium, you already have to deal with three electrons, with the corresponding structure in terms of single-electron orbitals shown in the first line of Table 11.1. To figure out the values of the total orbital, spin, and grand total momenta for this element, you can start with the one established for the helium atom and add an additional electron to it, assuming that it does not disturb the existing configuration of the two electrons in the closed shell. Since we know that this electron goes to the orbital with $l = 0$, the total orbital momentum remains zero, and the total spin becomes $1/2$ (you add a single spin to a state with $S = 0$, so what else can you get?), so the grand total momentum becomes $J = 0 + 1/2 = 1/2$, and the term symbol for Li becomes the same as for hydrogen, ${}^2S_{1/2}$, emphasizing the periodicity of the electronic properties of the elements. For the same reason, the term symbol for the next element, beryllium, is exactly the same as the one we derived for helium (see Table 11.1).

To figure out the term symbol for boron, ignore the two electrons in the first closed shell, which do not contribute anything to the total orbital or spin momenta, and focus on the three electrons in the second shell. For these three electrons, you have available two orbitals with the same orbital state, $l_1 = l_2 = 0$, and opposite spin states and an extra orbital with $l_3 = 1$ and $s_3 = 1/2$. The total orbital and spin momenta in this case can only be equal to $L = 1$ and $S = 1/2$, while the grand total momentum can be either $J_1 = 1/2$ or $J_2 = 3/2$. Thus, boron can be in one of two configurations, ${}^2P_{1/2}$ or ${}^2P_{3/2}$, but so far we have no means of figuring out which of these two configurations has a lower energy. To answer this question, we can ask for help from the German physicist Friedrich Hermann Hund, who formulated a set of empirical rules determining which term symbol describes the electron configurations in atoms with the lowest energy. These rules can be formulated as follows:

1. For a given configuration, a term with the largest total spin has the lowest energy.

2. Among the terms with the same multiplicity, a term with the largest total orbital momentum has the lowest energy.

3. For the terms with the same total spin and total orbital momentum, the value of the grand total momentum corresponding to the lowest energy is determined by the filling of the outermost shell. If the outermost shell is half-filled or less than half-filled, then the term with the lowest value of the grand total momentum has the lowest energy, but if the outermost shell is more than half-filled, the term with the largest value of the grand total momentum has the lowest energy.

In the case of boron, you have to go straight to the third of Hund’s rules because the first two do not discriminate between the corresponding terms. Checking Table 11.1, you can see that the outermost shell for boron is the one characterized by principal number $n = 2$, and the total number of single-particle orbitals on this shell is 8. Since boron has three electrons on this shell, the shell is less than half-filled, so that the third Hund’s rule tells you that the ground state configuration of boron is ${}^2P_{1/2}$.

The case of carbon is even more interesting. Ignoring again two electrons with $L = 0$ and $S = 0$, I focus on two p-electrons with $l_1 = l_2 = 1$. Speaking of total orbital momentum and total spin, you can identify the following possible values for $L$ and $S$: $L = 0, 1, 2$ and $S = 0, 1$. However, one needs to remember that the overall state, including its spin and orbital components, must be antisymmetric, so that not all combinations of $L$ and $S$ are possible. For instance, you already know that the $L = 2$ orbital states are all symmetric; therefore, they can only coexist with the spin singlet $S = 0$. The corresponding grand total momentum is $J = 2$, so that the respective term is ${}^1D_2$. The state with total orbital momentum $L = 1$ is antisymmetric and, therefore, demands a symmetric triplet spin state $S = 1$. This combination of orbital and spin momenta can generate grand total momentum $J = 2, 1, 0$, so that we have the following terms: ${}^3P_2$, ${}^3P_1$, ${}^3P_0$. Finally, the symmetric $L = 0$ state must be coupled with the spin singlet, giving rise to the term ${}^1S_0$. In summary, I identified five possible terms consistent with the antisymmetry requirement: ${}^1D_2$, ${}^3P_2$, ${}^3P_1$, ${}^3P_0$, and ${}^1S_0$. Using the first two Hund’s rules, you can limit the choice of the ground state configuration to the P states, and since the outer shell of the C atom holds only 4 electrons, it is exactly half-filled, and the third Hund’s rule yields that the ground state configuration for carbon is ${}^3P_0$.

Figuring out term symbols for elements where there are more than two electrons in the incomplete subshell (orbitals with the same value of the single-particle orbital momentum), such as nitrogen (three electrons on the p-subshell), is more complex, so I give you the term symbols for the rest of the elements in the second row of the periodic table in Table 11.1 without proof for you to contemplate.
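The reasoning used for carbon is mechanical enough to be written as a short program. The sketch below enumerates the terms of two equivalent p-electrons allowed by antisymmetry and then applies Hund’s rules in the form stated above. The function names and the `(L, S, J)` tuple layout are hypothetical, and the filling criterion is passed in explicitly as the number of electrons and the capacity of the outer shell, following the text’s wording.

```python
def p2_terms():
    """Terms (L, S, J) of two equivalent p electrons allowed by antisymmetry."""
    terms = []
    for L in (0, 1, 2):
        # Even L: symmetric orbital state, needs the singlet (S = 0).
        # Odd L: antisymmetric orbital state, needs the triplet (S = 1).
        S = 0 if L % 2 == 0 else 1
        for J in range(abs(L - S), L + S + 1):
            terms.append((L, S, J))
    return terms


def hund_ground_term(terms, n_electrons, shell_capacity):
    """Apply Hund's three rules in order and return the selected (L, S, J)."""
    max_S = max(S for _, S, _ in terms)                  # rule 1: largest spin
    candidates = [t for t in terms if t[1] == max_S]
    max_L = max(L for L, _, _ in candidates)             # rule 2: largest L
    candidates = [t for t in candidates if t[0] == max_L]
    js = [J for _, _, J in candidates]
    # Rule 3: lowest J for a shell at most half-filled, largest J otherwise.
    J = min(js) if 2 * n_electrons <= shell_capacity else max(js)
    return (max_L, max_S, J)


print(p2_terms())
print(hund_ground_term(p2_terms(), n_electrons=4, shell_capacity=8))
```

For carbon the selected tuple is $(L, S, J) = (1, 1, 0)$, that is, the ${}^3P_0$ term.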

11.4 Exchange Energy and Other Exchange Effects

11.4.1 Exchange Interaction

Some of the examples discussed in the previous section have already demonstrated a weird interconnectedness between spin and orbital components of many-particle states, which has nothing to do with any kind of real spin–orbit interaction. Recall, for instance, Eqs. 11.25 and 11.26 for two-fermion states: the triplet spin state in Eq. 11.25 requires an antisymmetric orbital state, while the singlet spin state in Eq. 11.26 asks for the orbital states to be symmetric. In the absence of interaction between electrons, all three $S = 1$ states are degenerate and belong to the energy eigenvalue $E_1 + E_2$, where $E_{1,2}$ are eigenvalues of the single-particle Hamiltonian corresponding to the “occupied” orbital states. At the same time, $S = 0$ states correspond to three different energies $2E_1$, $2E_2$, and $E_1 + E_2$, depending upon the orbital components used in their construction. The $E_1 + E_2$ energy level is, therefore, fourfold degenerate, with the corresponding eigenvectors formed by symmetric and antisymmetric combinations of the same two orbital functions $|\psi_1\rangle$ and $|\psi_2\rangle$. It is important to emphasize that three of these degenerate states correspond to the total spin of the system $S = 1$, and the fourth one possesses total spin $S = 0$. An interaction between electrons, however, might lift the degeneracy, making the energy of a two-electron system dependent on its spin state even in the absence of any actual spin-dependent interactions. This is yet another fascinating piece of evidence of the weirdness of the quantum world.

I will demonstrate this phenomenon using a simple spin-independent Coulomb interaction potential:

$$
\hat{V}(1,2) = \frac{e^2}{4\pi\varepsilon_0 \left|\hat{\mathbf{r}}_1 - \hat{\mathbf{r}}_2\right|},
$$

which describes the repulsion between two electrons in the atom of helium and is added to an attractive potential responsible for the interaction between the electrons and the nucleus. While a mathematically rigorous solution of a quantum three-body problem is too complicated for us to handle, what I can do is to compute the expectation value of the potential $\hat{V}(1,2)$ using eigenvectors of the non-interacting electrons. As you will find out later in Chap. 13, such an expectation value gives you an approximation for the interaction-induced correction to the eigenvalues of the Hamiltonian.

Let me begin with the two-fermion state described by the vector presented in Eq. 11.25, which is characterized by an antisymmetric orbital component. The interaction potential does not contain any spin-related operators, allowing me to ignore the spin component of this state (it will simply yield $\langle 1, M_s | 1, M_s\rangle = 1$) and write the expectation value as follows:

\begin{align}
\left\langle \hat{V}(1,2) \right\rangle &= \notag \\
\frac{1}{2}&\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}| - \langle\psi_1^{(2)}|\langle\psi_2^{(1)}|\right] \hat{V}(1,2) \left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right] = \notag \\
\frac{1}{2}&\left[\langle\psi_2^{(2)}|\langle\psi_1^{(1)}| \hat{V}(1,2) |\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + \langle\psi_1^{(2)}|\langle\psi_2^{(1)}| \hat{V}(1,2) |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right] - \tag{11.32} \\
\frac{1}{2}&\left[\langle\psi_1^{(2)}|\langle\psi_2^{(1)}| \hat{V}(1,2) |\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + \langle\psi_2^{(2)}|\langle\psi_1^{(1)}| \hat{V}(1,2) |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]. \tag{11.33}
\end{align}

If you carefully compare the terms in the third and fourth lines of the expression above, you will notice a striking difference between them. In both terms in the third line (Eq. 11.32), the ket and bra vectors describing any of the two particles represent the same state ($|\psi_1^{(1)}\rangle$ and $\langle\psi_1^{(1)}|$, $|\psi_2^{(2)}\rangle$ and $\langle\psi_2^{(2)}|$), while the ket and bra vectors of the same particle in the fourth line (Eq. 11.33) correspond to different states ($|\psi_1^{(1)}\rangle$ and $\langle\psi_2^{(1)}|$, $|\psi_2^{(2)}\rangle$ and $\langle\psi_1^{(2)}|$). In other words, the terms in the line labeled as Eq. 11.32 look like regular single-particle expectation values, while the terms in the next line look like non-diagonal matrix elements computed between different states for each particle. You can also notice that the two terms in Eq. 11.32 can be transformed into each other by the exchange operator $\hat{P}(1,2)$. Since the particles are identical, no matrix element can change as a result of the transposition, which means that these terms are equal to each other. If you, however, apply the exchange operator to the terms in Eq. 11.33, you will generate expressions where ket and bra vectors are reversed, meaning that these terms are complex conjugates of each other. Finally, you can easily see that the expression in Eq. 11.32 would have exactly the same form even if the particles in question were distinguishable, while Eq. 11.33 results from the antisymmetrization requirements imposed on the two-electron state.

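Taking all this into account, the interaction expectation value can be presented as

$$
\left\langle \hat{V}(1,2) \right\rangle = V_C + V_{exc}, \tag{11.34}
$$

where $V_C$ is defined as

$$
V_C = \langle\psi_2^{(2)}|\langle\psi_1^{(1)}| \hat{V}(1,2) |\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle
$$

and $V_{exc}$ as

$$
V_{exc} = -\mathrm{Re}\left[\langle\psi_1^{(2)}|\langle\psi_2^{(1)}| \hat{V}(1,2) |\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle\right].
$$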

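Using the position representation for the orbital states, the expression for $V_C$ can be written down in the explicit form

$$
V_C = \frac{e^2}{4\pi\varepsilon_0} \int d^3 r_1 \int d^3 r_2\, \frac{|\psi_1(\mathbf{r}_1)|^2\, |\psi_2(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{11.35}
$$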

which makes all statements made about $V_C$ rather obvious. If you agree to identify $e|\psi(\mathbf{r})|^2$ with the charge density, you can interpret Eq. 11.35 as the classical energy of the Coulomb interaction between two continuously distributed charges with densities $e|\psi_1(\mathbf{r})|^2$ and $e|\psi_2(\mathbf{r})|^2$.

The expression for $V_{exc}$ in the position representation takes the form

$$
V_{exc} = -\frac{e^2}{4\pi\varepsilon_0}\, \mathrm{Re}\left[\int d^3 r_1 \int d^3 r_2\, \frac{\psi_1^*(\mathbf{r}_2)\, \psi_2^*(\mathbf{r}_1)\, \psi_1(\mathbf{r}_1)\, \psi_2(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\right], \tag{11.36}
$$

which does not have any classical interpretation. This contribution to the energy is called exchange energy, and its origin can be directly traced to the antisymmetrization requirement. The expectation value computed with the symmetric orbital state would have the same form as in Eq. 11.34, with one important difference: a different sign in front of the exchange energy term. Thus, previously degenerate states are now split by the interaction by an amount equal to $2V_{exc}$ on the basis of their spin states. Just think about it: in the absence of any special spin–orbit interaction term in the Hamiltonian, the energies of the two-electron states composed of the same single-particle orbitals depend on their spin state! This is a purely quantum effect, one of the manifestations of the oddity of quantum mechanics, which has profound experimental and technological implications. However, first of all, I want you to get some feeling for the actual magnitude of this effect; for this reason, I am going to compute the Coulomb and exchange energies for a simple example of a two-electron state of the helium atom.

For concreteness (and to simplify calculations), I will presume that the orbitals participating in the construction of the two-electron state are $|1, 0, 0\rangle$ and $|2, 0, 0\rangle$, where I used the notation for the states from Chap. 8. In the position representation, the corresponding wave functions are $\psi_1(r_1) = R_{10}(r_1)/\sqrt{4\pi}$ and $\psi_2(r_2) = R_{20}(r_2)/\sqrt{4\pi}$, where $R_{10}$ and $R_{20}$ are hydrogen radial wave functions, and the factor $1/\sqrt{4\pi}$ is what is left of the spherical harmonics with zero orbital momentum.

When integrating Eq. 11.35 with respect to $\mathbf{r}_1$, I can choose the Z-axis of the spherical coordinate system in the direction of $\mathbf{r}_2$, in which case the denominator in this equation can be written down as

$$
|\mathbf{r}_1 - \mathbf{r}_2| = \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta_1}.
$$

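The integral over $\mathbf{r}_1$ now becomes

$$
I(r_2) = \frac{32}{4\pi a_B^3} \int_0^\infty dr_1 \int_0^\pi d\theta_1 \int_0^{2\pi} d\varphi_1\, r_1^2 \sin\theta_1\, \frac{e^{-4r_1/a_B}}{\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta_1}} = \frac{16}{a_B^3} \int_0^\infty dr_1\, r_1^2\, e^{-4r_1/a_B} \int_{-1}^{1} dx\, \frac{1}{\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 x}},
$$

where I substituted

$$
R_{10} = 2\left(\frac{2}{a_B}\right)^{3/2} \exp\left(-\frac{2r}{a_B}\right)
$$

(remember that $Z = 2$ for He). The integral over $x$ yields

$$
\int_{-1}^{1} dx\, \frac{1}{\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 x}} = \frac{1}{2 r_1 r_2} \int_{-2 r_1 r_2}^{2 r_1 r_2} \frac{dz}{\sqrt{r_1^2 + r_2^2 + z}} = \frac{1}{r_1 r_2}\left[\sqrt{r_1^2 + r_2^2 + 2 r_1 r_2} - \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2}\right] = \frac{r_1 + r_2 - |r_1 - r_2|}{r_1 r_2}. \tag{11.37}
$$

Evaluating this expression separately for $r_1 > r_2$ and $r_1 < r_2$, I find for $I(r_2)$

$$
I(r_2) = \frac{32}{a_B^3 r_2} \int_0^{r_2} dr_1\, r_1^2\, e^{-4r_1/a_B} + \frac{32}{a_B^3} \int_{r_2}^{\infty} dr_1\, r_1\, e^{-4r_1/a_B} = \frac{1}{r_2}\left[1 - \left(1 + \frac{2 r_2}{a_B}\right) e^{-4r_2/a_B}\right]. \tag{11.38}
$$

Now, using

$$
R_{20} = 2\left(\frac{1}{a_B}\right)^{3/2} \left(1 - \frac{r}{a_B}\right) \exp\left(-\frac{r}{a_B}\right),
$$

I get for $V_C$

$$
\begin{aligned}
V_C &= \frac{e^2}{4\pi\varepsilon_0}\, \frac{4}{4\pi a_B^3} \int_0^\infty dr_2 \int_0^\pi d\theta_2 \int_0^{2\pi} d\varphi_2\, \sin\theta_2\, r_2^2\, I(r_2) \left(1 - \frac{r_2}{a_B}\right)^2 \exp\left(-\frac{2 r_2}{a_B}\right) \\
&= \frac{e^2}{4\pi\varepsilon_0}\, \frac{4}{a_B^3} \int_0^\infty dr_2\, r_2 \left[1 - \left(1 + \frac{2 r_2}{a_B}\right) e^{-4r_2/a_B}\right] \left(1 - \frac{r_2}{a_B}\right)^2 \exp\left(-\frac{2 r_2}{a_B}\right) \\
&= \frac{34}{81}\, \frac{e^2}{4\pi\varepsilon_0 a_B} \approx 0.84\, \mathrm{Ry} \approx 11.4\ \mathrm{eV},
\end{aligned}
$$

where I used Eq. 8.17 with $Z$ set to unity and the notation $\mathrm{Ry} \approx 13.6$ eV for the magnitude of hydrogen’s ground state energy (in vacuum), so that $e^2/(4\pi\varepsilon_0 a_B) = 2\,\mathrm{Ry}$.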
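The remaining radial integral contains only powers of $r_2$ multiplied by decaying exponentials, so it can be checked exactly with rational arithmetic. The following sketch assumes the normalized densities $|\psi_1|^2 = R_{10}^2/4\pi$ and $|\psi_2|^2 = R_{20}^2/4\pi$ with $Z = 2$, measures lengths in units of $a_B$ and energies in units of $e^2/(4\pi\varepsilon_0 a_B)$, and relies on the elementary integral $\int_0^\infty r^n e^{-\alpha r} dr = n!/\alpha^{n+1}$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def exp_moment(n, alpha):
    """Exact integral of r**n * exp(-alpha * r) over r from 0 to infinity."""
    return Fraction(factorial(n), alpha ** (n + 1))


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integrate(poly, alpha):
    """Integral of poly(r) * exp(-alpha * r) over r from 0 to infinity."""
    return sum(c * exp_moment(n, alpha) for n, c in enumerate(poly))


# Integrand: 4 r [1 - (1 + 2r) e^{-4r}] (1 - r)^2 e^{-2r}, with r in units of a_B.
first = poly_mul([0, 1], poly_mul([1, -1], [1, -1]))  # r (1 - r)^2, weight e^{-2r}
second = poly_mul(first, [1, 2])                      # r (1 - r)^2 (1 + 2r), weight e^{-6r}
coulomb = 4 * (integrate(first, 2) - integrate(second, 6))

HARTREE_EV = 27.2114  # e^2 / (4 pi eps0 a_B) = 2 Ry, in eV
print(coulomb, float(coulomb) * HARTREE_EV)
```

The script prints the Coulomb energy both as an exact fraction of $e^2/(4\pi\varepsilon_0 a_B)$ and in electron-volts.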

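Now, I will compute the exchange energy correction. Keeping the same notation $I(r_2)$ for the first integral with respect to $\mathbf{r}_1$, I can present it, using expressions for the radial functions provided above, as

$$
I(r_2) = \frac{8\sqrt{2}}{4\pi a_B^3} \int_0^\infty dr_1 \int_0^\pi d\theta_1 \int_0^{2\pi} d\varphi_1\, r_1^2 \sin\theta_1\, \frac{\left(1 - \frac{r_1}{a_B}\right) \exp\left(-\frac{3 r_1}{a_B}\right)}{\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta_1}} = \frac{4\sqrt{2}}{a_B^3} \int_0^\infty dr_1\, r_1^2 \exp\left(-\frac{3 r_1}{a_B}\right) \left(1 - \frac{r_1}{a_B}\right) \int_{-1}^{1} dx\, \frac{1}{\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 x}}.
$$

Equation 11.37 for the angular integral and Mathematica © for the remaining radial integrals yield

$$
I(r_2) = \frac{8\sqrt{2}}{a_B^3 r_2} \int_0^{r_2} dr_1\, r_1^2 \exp\left(-\frac{3 r_1}{a_B}\right) \left(1 - \frac{r_1}{a_B}\right) + \frac{8\sqrt{2}}{a_B^3} \int_{r_2}^{\infty} dr_1\, r_1 \exp\left(-\frac{3 r_1}{a_B}\right) \left(1 - \frac{r_1}{a_B}\right) = \frac{8\sqrt{2}}{27 a_B} \left(1 + \frac{3 r_2}{a_B}\right) e^{-3r_2/a_B}.
$$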

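Plugging it into Eq. 11.36 and dropping the real-part sign because all the functions in the integral are real, I have

$$
\begin{aligned}
V_{exc} &= -\frac{e^2}{4\pi\varepsilon_0}\, \frac{8\sqrt{2}}{27 a_B}\, 4 \left(\frac{2}{a_B}\right)^{3/2} \left(\frac{1}{a_B}\right)^{3/2} \int_0^\infty dr_2\, r_2^2 \exp\left(-\frac{2 r_2}{a_B}\right) \left(1 - \frac{r_2}{a_B}\right) \exp\left(-\frac{r_2}{a_B}\right) \left(1 + \frac{3 r_2}{a_B}\right) \exp\left(-\frac{3 r_2}{a_B}\right) \\
&= -\frac{128}{27}\, \frac{e^2}{4\pi\varepsilon_0 a_B^4} \int_0^\infty dr_2\, r_2^2 \exp\left(-\frac{6 r_2}{a_B}\right) \left(1 - \frac{r_2}{a_B}\right) \left(1 + \frac{3 r_2}{a_B}\right) = -\frac{32}{729}\, \frac{e^2}{4\pi\varepsilon_0 a_B} \approx -1.19\ \mathrm{eV}.
\end{aligned}
$$

Thus, this calculation showed that a state with $S = 1$ has a smaller energy than a state with $S = 0$ by $2 \times 1.19 \approx 2.4$ eV. However, the sign of the integral in the exchange term depends on the fine details of wave functions representing single-electron orbitals and is not predestined. If single-particle orbitals of electrons were different, describing, for instance, electrons with non-zero orbital momentum in outer shells of heavier elements, or electrons in metals, the situation might have been reversed, and the singlet spin state could have a lower energy than a triplet.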
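The last radial integral for the exchange term is again a polynomial times an exponential, so its value can be confirmed exactly. This is a sketch under the same assumptions as the text (hydrogen-like $1s$ and $2s$ orbitals with $Z = 2$), with lengths in units of $a_B$ and energies in units of $e^2/(4\pi\varepsilon_0 a_B) \approx 27.2$ eV; the variable names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def exp_moment(n, alpha):
    """Exact integral of r**n * exp(-alpha * r) over r from 0 to infinity."""
    return Fraction(factorial(n), alpha ** (n + 1))


# r^2 (1 - r)(1 + 3r) = r^2 + 2 r^3 - 3 r^4, multiplied by exp(-6 r).
coeffs = {2: 1, 3: 2, 4: -3}
radial = sum(c * exp_moment(n, 6) for n, c in coeffs.items())

exchange = -Fraction(128, 27) * radial  # V_exc in units of e^2 / (4 pi eps0 a_B)

HARTREE_EV = 27.2114                    # e^2 / (4 pi eps0 a_B) in eV
print(radial, exchange)
print(float(exchange) * HARTREE_EV)             # exchange energy in eV
print(2 * abs(float(exchange)) * HARTREE_EV)    # singlet-triplet splitting in eV
```

The negative sign of the result is what places the triplet below the singlet for this particular pair of orbitals.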

This difference between energies of symmetric and antisymmetric spin states gives rise to something known as an exchange interaction between spins and plays an extremely important role in the magnetic properties of materials. In particular, this “interaction,” which is simply a result of the fermion nature of electrons, drives the formation of the ordered spin arrangements behind such phenomena as ferromagnetism and antiferromagnetism.

Ferromagnets (materials with permanent magnetization) have been known since the earliest days of human civilization, but the origin of their magnetic properties remained a mystery for a very long time. André-Marie Ampère (a French physicist who lived between 1775 and 1836 and made seminal contributions to electromagnetism) proposed that magnetization is the result of the alignment of all dipole magnetic moments formed by circular electron currents of each atom in the same direction. This alignment, he believed, was due to the magnetostatic interaction between the dipoles, which made the energy of the system lowest when all dipoles point in the same direction. Unfortunately, calculations showed that the magnetostatic interaction is so weak that thermal fluctuations would destroy the ferromagnetic order even at temperatures as low as a few kelvins. The energy of the spin exchange interaction is much bigger (if you think that 2 eV is a small energy, you will be delighted to know that it corresponds to a temperature of more than 20,000 K). The temperature at which iron loses its ferromagnetic properties due to thermal agitation is about 1043 K, which corresponds to an exchange energy of only about 90 meV, which is the real culprit behind the ordering of spin magnetic moments.

In the classical picture, ordering would mean that all magnetic moments are aligned in the same direction, but describing this phenomenon quantum mechanically, we would say that $N$ spins $S$ are aligned if they are in the symmetric state with total spin equal to $NS$. For such a state to correspond to the ground state of the system of $N$ spins, the exchange energy must favor (be lower for) symmetric states over the antisymmetric states. An antisymmetric state classically could be described as an array of magnetic moments, each pair of which is aligned in opposite directions, while quantum mechanically we would say that each pair of spins in the array is in the spin zero state. Materials with such an arrangement of magnetic moments are known as antiferromagnets, and for the antiferromagnetic state to be a ground state, the exchange energy must change its sign compared to the ferromagnetic case. The complete theory of magnetic order in solids is rather complicated, so you should not think that this brief glimpse into this area gives you any kind of even remotely complete picture, but beyond all these complexities, there is a main underlying physical mechanism: the exchange energy.
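The conversions between energies and temperatures quoted above follow from $E = k_B T$. Here is a minimal sketch; the constant is the Boltzmann constant in eV/K, and the function names are hypothetical.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def ev_to_kelvin(energy_ev):
    """Temperature T such that k_B * T equals the given energy."""
    return energy_ev / K_B_EV_PER_K


def kelvin_to_ev(temperature_k):
    """Thermal energy k_B * T in eV."""
    return temperature_k * K_B_EV_PER_K


print(ev_to_kelvin(2.0))            # temperature equivalent of 2 eV, in K
print(kelvin_to_ev(1043) * 1000)    # thermal energy at 1043 K, in meV
```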

11.4.2 Exchange Correlations

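The symmetry requirement on the many-particle states of indistinguishable particles affects not only their interaction energy but also their spatial positions. To illustrate this point, I will compute the expectation value of the squared distance between two electrons, defined as

$$
\left\langle (\mathbf{r}_1 - \mathbf{r}_2)^2 \right\rangle = \left\langle r_1^2 \right\rangle + \left\langle r_2^2 \right\rangle - 2 \left\langle \mathbf{r}_1 \cdot \mathbf{r}_2 \right\rangle. \tag{11.39}
$$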

This time around, I will assume that the two electrons belong to two different hydrogen-like atoms separated by a distance $R$ small enough for the wave functions describing states of each electron to have significant spatial overlap. I will also assume that the electrons are in the same atomic orbital $|n, l, m\rangle$, but since each of these orbitals belongs to two different atoms, they represent different states, even if they are described by the same set of quantum numbers. To distinguish between these orbitals, I will add another parameter to the set of quantum numbers characterizing the position of the nucleus: $|n, l, m, R\rangle$. If the atoms are separated by a significant distance, you can quite clearly ascribe an electron to the atom it belongs to. However, when the distance between the nuclei becomes comparable with the characteristic size of the electron’s wave function, this identification is no longer possible, and you have to introduce two orbitals for each electron, $|n, l, m, R_1\rangle_i$ and $|n, l, m, R_2\rangle_i$, where the lower index $i$ outside of the ket symbol takes values 1 or 2 signifying one or another electron. This two-electron system can again be in a singlet or triplet spin state demanding a symmetric or antisymmetric two-electron orbital state:

$$
|\pm\rangle = \frac{1}{\sqrt{2}} \left[ |n, l, m, R_1\rangle_1 |n, l, m, R_2\rangle_2 \pm |n, l, m, R_1\rangle_2 |n, l, m, R_2\rangle_1 \right]. \tag{11.40}
$$

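The first two terms in Eq. 11.39 are determined by single-particle orbitals:

$$
\begin{aligned}
\langle\pm| r_1^2 |\pm\rangle = \frac{1}{2} \Big[ &{}_1\langle n, l, m, R_1| r_1^2 |n, l, m, R_1\rangle_1 + {}_1\langle n, l, m, R_2| r_1^2 |n, l, m, R_2\rangle_1 \\
&\pm {}_1\langle n, l, m, R_1| r_1^2 |n, l, m, R_2\rangle_1 \cdot {}_2\langle n, l, m, R_1| n, l, m, R_2\rangle_2 \\
&\pm {}_1\langle n, l, m, R_2| r_1^2 |n, l, m, R_1\rangle_1 \cdot {}_2\langle n, l, m, R_2| n, l, m, R_1\rangle_2 \Big].
\end{aligned}
$$

When writing this expression, I took into account that the orbitals belonging to the same atom are normalized, ${}_2\langle n, l, m, R_2| n, l, m, R_2\rangle_2 = 1$, but orbitals belonging to different atoms are not necessarily orthogonal: ${}_2\langle n, l, m, R_1| n, l, m, R_2\rangle_2 \neq 0$. A similar expression for $\langle\pm| r_2^2 |\pm\rangle$ is

$$
\begin{aligned}
\langle\pm| r_2^2 |\pm\rangle = \frac{1}{2} \Big[ &{}_2\langle n, l, m, R_1| r_2^2 |n, l, m, R_1\rangle_2 + {}_2\langle n, l, m, R_2| r_2^2 |n, l, m, R_2\rangle_2 \\
&\pm {}_2\langle n, l, m, R_1| r_2^2 |n, l, m, R_2\rangle_2 \cdot {}_1\langle n, l, m, R_1| n, l, m, R_2\rangle_1 \\
&\pm {}_2\langle n, l, m, R_2| r_2^2 |n, l, m, R_1\rangle_2 \cdot {}_1\langle n, l, m, R_2| n, l, m, R_1\rangle_1 \Big].
\end{aligned}
$$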

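Since both atoms are assumed to be identical, the following must be true:

$$
\begin{aligned}
{}_1\langle n, l, m, R_1| r_1^2 |n, l, m, R_1\rangle_1 &= {}_2\langle n, l, m, R_2| r_2^2 |n, l, m, R_2\rangle_2 \equiv a^2 \\
{}_1\langle n, l, m, R_2| r_1^2 |n, l, m, R_2\rangle_1 &= {}_2\langle n, l, m, R_1| r_2^2 |n, l, m, R_1\rangle_2 \equiv b^2 \\
{}_1\langle n, l, m, R_1| r_1^2 |n, l, m, R_2\rangle_1 &= {}_2\langle n, l, m, R_2| r_2^2 |n, l, m, R_1\rangle_2 \equiv u \\
{}_2\langle n, l, m, R_1| n, l, m, R_2\rangle_2 &= {}_1\langle n, l, m, R_2| n, l, m, R_1\rangle_1 \equiv v.
\end{aligned}
$$

All these relations can be obtained by noticing that the system remains unchanged if you replace $R_1 \to R_2$ and simultaneously interchange electron indexes 1 and 2. Taking into account these relations and the corresponding simplified notations, I can write for $\langle\pm| r_{1,2}^2 |\pm\rangle$:

$$
\langle\pm| r_1^2 |\pm\rangle = \langle\pm| r_2^2 |\pm\rangle = \frac{1}{2} \left[ a^2 + b^2 \pm \left( u v + u^* v^* \right) \right].
$$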


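The next step is to evaluate $\langle \mathbf{r}_1 \cdot \mathbf{r}_2 \rangle$:

\begin{align}
\langle\pm| \mathbf{r}_1 \cdot \mathbf{r}_2 |\pm\rangle = \frac{1}{2} \Big[ &{}_1\langle n, l, m, R_1| \mathbf{r}_1 |n, l, m, R_1\rangle_1 \cdot {}_2\langle n, l, m, R_2| \mathbf{r}_2 |n, l, m, R_2\rangle_2 \; + \tag{11.41} \\
&{}_1\langle n, l, m, R_2| \mathbf{r}_1 |n, l, m, R_2\rangle_1 \cdot {}_2\langle n, l, m, R_1| \mathbf{r}_2 |n, l, m, R_1\rangle_2 \notag \\
&\pm {}_1\langle n, l, m, R_1| \mathbf{r}_1 |n, l, m, R_2\rangle_1 \cdot {}_2\langle n, l, m, R_1| \mathbf{r}_2 |n, l, m, R_2\rangle_2 \notag \\
&\pm {}_1\langle n, l, m, R_2| \mathbf{r}_1 |n, l, m, R_1\rangle_1 \cdot {}_2\langle n, l, m, R_2| \mathbf{r}_2 |n, l, m, R_1\rangle_2 \Big]. \tag{11.42}
\end{align}

The evaluation of these expressions requires a more explicit determination of the point with respect to which electron position vectors are defined. Assuming for concreteness that the origin of the coordinate system is at the nucleus of atom 1, I can immediately note that the symmetry with respect to inversion kills the first two terms in Eq. 11.42 since ${}_1\langle n, l, m, R_1| \mathbf{r}_1 |n, l, m, R_1\rangle_1 = {}_2\langle n, l, m, R_1| \mathbf{r}_2 |n, l, m, R_1\rangle_2 = 0$. The remaining two terms survive and can be written as

$$
\langle\pm| \mathbf{r}_1 \cdot \mathbf{r}_2 |\pm\rangle = \pm |\mathbf{d}|^2,
$$

where I introduced the vector $\mathbf{d}$ defined as follows:

$$
{}_1\langle n, l, m, R_1| \mathbf{r}_1 |n, l, m, R_2\rangle_1 = {}_2\langle n, l, m, R_2| \mathbf{r}_2 |n, l, m, R_1\rangle_2 \equiv \mathbf{d}.
$$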

Finally, combining all the obtained results together, I can write

$$
\left\langle (\mathbf{r}_1 - \mathbf{r}_2)^2 \right\rangle = a^2 + b^2 \pm \left( u v + u^* v^* - 2 |\mathbf{d}|^2 \right). \tag{11.43}
$$

While the actual computation of matrix elements appearing in Eq. 11.43 is rather difficult and will not be attempted here, you can still learn something from this exercise. Its main lesson is that the spin state of the electrons affects how close the electrons of the two atoms can be. Assuming for concreteness that the expression in the parentheses in Eq. 11.43 is negative, which is favored by the term $|\mathbf{d}|^2$ (the actual sign depends on the single-electron orbitals), one can conclude that the antisymmetric spin state promoting the symmetric orbital state (the $+$ sign in $\pm$) results in electrons being closer together than in the case of the symmetric spin state. This is an interesting quantum mechanical effect: electrons appear to be “pushed” closer toward each other or further away from each other depending on their spin state even though there is no actual physical force doing the “pushing.” This phenomenon plays an important role in chemical bonding between atoms, because electrons, when “pushed” toward each other, pull their nuclei along, making the formation of a stable diatomic molecule more likely.
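To get a feeling for this effect without hydrogen matrix elements, one can play with a one-dimensional toy model. The sketch below replaces the atomic orbitals by two Gaussians of width `sigma` centred at $x = 0$ and $x = R$, for which all the matrix elements ($a^2$, $b^2$, $u$, $v$, $d$ in the notation above) are known in closed form. These are assumptions made purely for illustration. Unlike Eq. 11.40, the sketch normalizes the two-particle states with $1/\sqrt{2(1 \pm v^2)}$, which reduces to $1/\sqrt{2}$ when the overlap $v$ is negligible.

```python
from math import exp, isclose


def mean_square_separation(R, sigma, sign):
    """<(x1 - x2)^2> for two electrons in 1D Gaussian orbitals at x = 0 and x = R.

    sign = +1: symmetric orbital state (goes with the spin singlet);
    sign = -1: antisymmetric orbital state (goes with the spin triplet).
    """
    a, b = 0.0, R                                  # orbital centres
    v = exp(-(a - b) ** 2 / (4 * sigma ** 2))      # overlap <a|b>
    x2_aa = sigma ** 2 / 2 + a ** 2                # <a|x^2|a>
    x2_bb = sigma ** 2 / 2 + b ** 2                # <b|x^2|b>
    c = (a + b) / 2                                # centre of the product Gaussian
    u = v * (sigma ** 2 / 2 + c ** 2)              # <a|x^2|b>
    d = v * c                                      # <a|x|b>
    norm = 1 + sign * v ** 2                       # from the normalization of |+-> 
    sum_of_squares = (x2_aa + x2_bb + sign * 2 * u * v) / norm   # <x1^2> + <x2^2>
    cross = (a * b + sign * d ** 2) / norm                        # <x1 x2>
    return sum_of_squares - 2 * cross


R, sigma = 1.0, 1.0
symmetric = mean_square_separation(R, sigma, +1)
antisymmetric = mean_square_separation(R, sigma, -1)
print(symmetric, antisymmetric)
```

In this toy model the result simplifies to $\sigma^2 + R^2/(1 \pm v^2)$, so the symmetric orbital state always keeps the electrons closer together than the antisymmetric one, in line with the qualitative conclusion above.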


11.5 Fermi Energy

The behavior of systems consisting of many identical particles (and by many here I mean really huge, something like Avogadro’s number) is studied by a special field of physics called quantum statistics. Even a sketchy review of this field would take us well outside the scope of this book, but there is one problem involving a very large number of fermions, which we can handle. The issue in question is the structure of the ground state and its energy for the system of $N \gg 1$ non-interacting free electrons (an ideal electron gas) confined within a box of volume $V$. Each electron is a free particle characterized by a momentum $\mathbf{p}$, the corresponding single-particle energy $E_p = p^2/2m_e$, and a single-particle wave function (in the position representation) $\psi_{\mathbf{p}}(\mathbf{r}) = A_p \exp(i \mathbf{p} \cdot \mathbf{r}/\hbar)$, where $A_p$ is a normalization parameter, which was chosen in Sect. 5.1.1 to be $1/\sqrt{2\pi\hbar}$ to generate a delta-function normalized wave function.

Here it is more convenient to choose an alternative normalization, which would explicitly include the volume $V$ occupied by the electrons. To achieve this, I will impose so-called periodic boundary conditions:

$$
\psi_{\mathbf{p}}(\mathbf{r} + \mathbf{L}) = \psi_{\mathbf{p}}(\mathbf{r}), \tag{11.44}
$$

where $\mathbf{L}$ is a vector with components $L_x, L_y, L_z$ such that $L_x L_y L_z = V$. This boundary condition is the most popular choice in solid-state physics, and if you are wondering about its physical meaning and any kind of relation to reality, it does not really have any. The logic of using it is based upon two ideas. First, it is more convenient than, say, particle-in-the-box boundary conditions $\psi_{\mathbf{p}}(\mathbf{L}) = 0$, implying that the electrons are confined in an infinite potential well, because it keeps the wave functions in the complex exponential form rather than forcing them to become much less convenient real-valued sine functions. Second, it is believed that as long as we are not interested in specific surface-related phenomena, the behavior of the wave functions at the boundary of a solid should not have any impact on its bulk properties. I have used a similar idea when computing the total energy of the electromagnetic field in Sect. 7.3.1.

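This boundary condition imposes restrictions on the allowed values of the electron’s momentum:

$$\exp\left[i\mathbf{p}\cdot(\mathbf{r}+\mathbf{L})/\hbar\right] = \exp\left(i\mathbf{p}\cdot\mathbf{r}/\hbar\right) \;\Rightarrow\; \exp\left(i\mathbf{p}\cdot\mathbf{L}/\hbar\right) = 1 \;\Rightarrow\; \frac{p_i L_i}{\hbar} = 2\pi n_i, \tag{11.45}$$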

where $i = x, y, z$ and $n_i = \pm 1, \pm 2, \pm 3, \ldots$. In addition to making the spectrum of the momentum operator discrete, the periodic boundary condition also allows an alternative normalization of the wave function:

$$|A_p|^2 \int_{-L_x/2}^{L_x/2}\int_{-L_y/2}^{L_y/2}\int_{-L_z/2}^{L_z/2} e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\, dx\,dy\,dz = 1,$$

which yields $A_p = 1/\sqrt{V}$. The system of normalized and orthogonal single-electron wave functions takes the form

$$\psi_{n_1,n_2,n_3}(\mathbf{r}) = \frac{1}{\sqrt{V}}\exp\left[i\left(\frac{2\pi}{L_x}n_1 x + \frac{2\pi}{L_y}n_2 y + \frac{2\pi}{L_z}n_3 z\right)\right],$$

while the single-electron energies form a discrete spectrum defined by

$$\varepsilon_{n_1 n_2 n_3} = \frac{(2\pi\hbar)^2}{2m_e}\left(\frac{n_1^2}{L_x^2} + \frac{n_2^2}{L_y^2} + \frac{n_3^2}{L_z^2}\right). \tag{11.46}$$

This wave function generates two single-electron orbitals characterized by two different values of the spin magnetic number $m_s$, which is perfectly suitable for generating many-particle states of the $N$-electron system. The ground state of this system is given by the Slater determinant formed by $N$ single-electron orbitals with the lowest possible single-particle energies, ranging from $\varepsilon_{1,1,1}$ to some maximum value $\varepsilon_F$ corresponding to the last orbital making it into the determinant. Thus, all single-particle orbitals of electrons are divided into two groups: those that are included in the Slater determinant for the ground state (occupied) and those that are not (empty or vacant). The occupied and empty orbitals are separated by the energy $\varepsilon_F$ known as the Fermi energy. The Fermi energy is an important characteristic of an electron gas, which obviously depends on the number of electrons $N$ and determines many of its ground state properties. Thus, let’s spend some time trying to figure it out.

In principle, finding $\varepsilon_F$ is quite straightforward: one needs to find the total number of orbitals $M(\varepsilon)$ with energies less than $\varepsilon$. Then the Fermi energy is found from the equation

$$M(\varepsilon_F) = N. \tag{11.47}$$

However, counting the orbitals and finding $M(\varepsilon)$ are not quite trivial because the energy values defined by Eq. 11.46 are highly degenerate, and what is even worse is that there is no known analytical formula for the degree of the degeneracy as a function of energy. The problem, however, can be solved in the limit when $N \to \infty$ and $V \to \infty$ so that the concentration of electrons $N/V$ remains constant. In this limit the discrete granular structure of the energy spectrum becomes negligible (the spectrum in this case is called quasi-continuous), and the function $M(\varepsilon)$ can be determined.
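To get a feeling for how irregular the degeneracy is, one can simply enumerate it. The sketch below is an added illustration, not part of the derivation: for a cubic box it counts how many triples $(n_1, n_2, n_3)$ of nonzero integers share the same value of $n_1^2 + n_2^2 + n_3^2$, which fixes the energy in Eq. 11.46. The function name `degeneracies` and the cutoff `n_max` are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product

def degeneracies(n_max):
    """Map each value of n1^2 + n2^2 + n3^2 to the number of triples of
    nonzero integers (n1, n2, n3), |n_i| <= n_max, that produce it.

    Spin is not included: multiply by 2 to get the number of orbitals.
    The counts are complete only for values below (n_max + 1)^2 + 2.
    """
    values = [n for n in range(-n_max, n_max + 1) if n != 0]
    counts = Counter(a * a + b * b + c * c for a, b, c in product(values, repeat=3))
    return dict(counts)

if __name__ == "__main__":
    deg = degeneracies(6)
    # Print the lowest few levels: (sum of squares, number of triples).
    for n_squared in sorted(deg)[:8]:
        print(n_squared, deg[n_squared])
```

The printed counts jump up and down from level to level with no obvious pattern, which is the practical obstacle to solving Eq. 11.47 exactly.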

You might think that I am nuts because I first introduce finite $V$ to make the spectrum discrete and then go to the limit $V \to \infty$ to make it continuous again. The thing is that if I had begun with the infinite volume and continuous spectrum, the only information I would have had about the number of states is that it is infinite (the number of states of a continuous spectrum is infinite for any finite interval of energies), which does not help me at all. What I am getting out of this roundabout approach is the knowledge about how the number of states turns infinite when the volume goes to infinity, and as you will see, this is exactly what we need to find the Fermi energy.

In order to find $M(\varepsilon)$, it is convenient to visualize the states that need to be counted. This can be done by representing the single-electron orbitals graphically as points with coordinates $n_1, n_2, n_3$ in a three-dimensional space defined by a regular Cartesian coordinate system with axes $X$, $Y$, and $Z$. Each point here represents two orbitals with different values of the spin magnetic number. Surrounding each point by a little cube with sides equal to unity, I can cover the entire three-dimensional space containing the electron’s orbitals. Since each cube has a unit volume, the total volume covered by the cubes is equal to the number of points within the covered region. Since each point represents two orbitals with opposite spins, the number of all orbitals in this region is twice the number of points.

For simplicity let me assume that $L_x = L_y = L_z \equiv L$, which allows me to rewrite Eq. 11.46 in the form

$$n^2(\varepsilon) = n_1^2 + n_2^2 + n_3^2, \tag{11.48}$$

where I introduced

$$n^2(\varepsilon) = \frac{2m_e\,\varepsilon_{n_1 n_2 n_3} L^2}{(2\pi\hbar)^2}. \tag{11.49}$$

Equation 11.48 defines a sphere in the space of electron orbitals with radius $n \propto L\sqrt{\varepsilon}$. If you allow non-integer values for the numbers $n_{1,2,3}$, you could say that each point on the surface of the sphere corresponds to states with the same energy $\varepsilon$ (such a surface is called isoenergetic). All points in the interior of the surface correspond to states with energies less than $\varepsilon$, while all points in the exterior represent states with energies larger than $\varepsilon$. Now, the number of orbitals encompassed by the surface is, as I just explained, simply equal to the volume of the corresponding region multiplied by two to account for two values of spin. Thus, I can write for the number of states with energies less than $\varepsilon$:$^{4}$

$$M(\varepsilon) = 2\cdot\frac{4}{3}\pi\left(\frac{L}{2\pi\hbar}\right)^3 (2m_e\varepsilon)^{3/2} = \frac{V(2m_e\varepsilon)^{3/2}}{3\pi^2\hbar^3}. \tag{11.50}$$
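The replacement of a point count by a volume, which underlies Eq. 11.50, can be tested by brute force. The sketch below is an added illustration under a simplifying assumption: it counts all integer points (including those with some $n_i = 0$) inside a sphere of a given radius and compares the count with $\frac{4}{3}\pi n^3$. Spin and the physical prefactors are left out, and the function names are hypothetical.

```python
import numpy as np

def count_points(radius):
    """Number of integer points (n1, n2, n3) with n1^2 + n2^2 + n3^2 <= radius^2."""
    m = int(np.floor(radius))
    axis = np.arange(-m, m + 1)
    n1, n2, n3 = np.meshgrid(axis, axis, axis, indexing="ij")
    return int(np.count_nonzero(n1**2 + n2**2 + n3**2 <= radius**2))

def sphere_volume(radius):
    return 4.0 / 3.0 * np.pi * radius**3

def relative_error(radius):
    """Relative difference between the point count and the sphere volume."""
    volume = sphere_volume(radius)
    return abs(count_points(radius) - volume) / volume

if __name__ == "__main__":
    for radius in (5.0, 10.0, 20.0, 40.0):
        print(radius, count_points(radius), relative_error(radius))
```

Since the radius in the space of orbitals grows in proportion to $L$, running the loop over increasing radii mimics the passage to large volumes.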

$^{4}$ If instead of periodic boundary conditions you used the particle-in-the-box boundary conditions requiring that the wave function vanish at the boundary of the region $L_x \times L_y \times L_z$, you would have ended up with $p_i = \pi\hbar n_i/L_i$, where $n_i$ now can only take positive values because wave functions $\sin(\pi n_1 x/L_x)\sin(\pi n_2 y/L_y)\sin(\pi n_3 z/L_z)$ with positive and negative values of $n_i$ represent the same function, while the function $\exp\left[i\left(\frac{2\pi}{L_x}n_1 x + \frac{2\pi}{L_y}n_2 y + \frac{2\pi}{L_z}n_3 z\right)\right]$ with positive and negative indexes represents two linearly independent states. As a result, Eq. 11.50 when used in this case would have an extra factor 1/8 reflecting the fact that only 1/8 of a sphere corresponds to points with all positive coordinates.


Fig. 11.3 Two-dimensional version of the state counting procedure described in the text: squares replace cubes, a circle represents a sphere, and the points are still the states, now specified by two instead of three integer numbers. The 2-D version is easier to process visually, but illustrates all the important points

The problem with this calculation is, of course, that the points on the surface do not necessarily correspond to integer values of $n_1$, $n_2$, and $n_3$, so that this surface cuts through the little cubes in its immediate vicinity (see Fig. 11.3 representing a two-dimensional version of the described construction). As a result, some states with energies in the thin layer surrounding the spherical surface cannot be counted accurately. The number of such states is obviously proportional to the area of the enclosing sphere, which is $\propto L^2$, while the number of states counted correctly is $\propto L^3$, so that the relative error of the outlined procedure behaves as $1/L$ and approaches zero as $L$ goes to infinity. Thus, Eq. 11.50 can be considered to be asymptotically correct in the limit $L \to \infty$. Now you can see the value of this procedure with the initial discretization followed by passing to the quasi-continuous limit. It allowed me to establish the exact dependence of $M$ on the volume $V$ expressed by Eq. 11.50. Now I can easily find the Fermi energy by substituting Eq. 11.50 into Eq. 11.47:

$$\frac{V(2m_e\varepsilon_F)^{3/2}}{3\pi^2\hbar^3} = N \;\Rightarrow\; \varepsilon_F = \frac{\hbar^2}{2m_e}\left(\frac{3\pi^2 N}{V}\right)^{2/3}. \tag{11.51}$$
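To attach a number to Eq. 11.51, here is a small sketch that evaluates it in SI units. The physical constants are standard values; the electron concentration used for copper, $8.47 \times 10^{28}\ \mathrm{m^{-3}}$, is a commonly quoted textbook figure and is an assumption of this example rather than something taken from the text. The function name `fermi_energy` is likewise hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron-volt in joules

def fermi_energy(concentration):
    """Fermi energy (J) of an ideal electron gas, Eq. 11.51.

    concentration: number of electrons per unit volume N/V, in m^-3.
    """
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * concentration) ** (2.0 / 3.0)

if __name__ == "__main__":
    n_copper = 8.47e28  # assumed conduction-electron concentration, m^-3
    print("Fermi energy in eV:", fermi_energy(n_copper) / EV)
```

The function takes only the ratio $N/V$ as input, which mirrors the fact that $N$ and $V$ never appear separately in Eq. 11.51.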

The most important feature of Eq. 11.51 is that the number of electrons $N$ and the volume $V$ they occupy, the two quantities which are supposed to go to infinity, appear in this equation only in the form of the ratio $N/V$, which we shall obviously keep constant when passing to the limit $V \to \infty$, $N \to \infty$. The ratio $N/V$ specifies the number of electrons per unit volume and is also known as the electron concentration. This is one of the most important characteristics of the electron gas.

It is important to understand that the Fermi energy is the single-electron energy of the last “occupied” single-particle orbital and is not the energy of the many-electron ground state. To find the latter I need to add energies of all occupied single-electron orbitals:

$$E = 2\sum_{n_1,n_2,n_3}^{N_{\max}} \varepsilon_{n_1,n_2,n_3}, \tag{11.52}$$

where $N_{\max}$ is the collection of indexes $n_1, n_2, n_3$ corresponding to the last occupied state and the factor of 2 accounts for the spin variable. I will compute this sum again in the limit $V \to \infty$, $N \to \infty$ (which, by the way, is called the thermodynamic limit), and while doing so I will show you a trick of converting discrete sums into integrals. This operation is, again, possible because in the thermodynamic limit the discrete spectrum becomes quasi-continuous, and the arguments I am going to employ here are essentially the same as the ones used to compute the Fermi energy but with a slightly different flavor.

So I begin. When an orbital index $n_i$ changes by one, the change of the respective component of the momentum $p_i$ can be presented as

$$\Delta p_i = \frac{2\pi\hbar}{L_i}\Delta n_i,$$

where $\Delta n_i = 1$. With this relation in mind, I can rewrite Eq. 11.52 as

$$E = 2\sum_{n_1,n_2,n_3}^{N_{\max}} \varepsilon_{n_1,n_2,n_3}\,\Delta n_1\Delta n_2\Delta n_3 = \frac{2L^3}{(2\pi\hbar)^3}\sum_{p_x,p_y,p_z}^{N_{\max}} \varepsilon_{p_x,p_y,p_z}\,\Delta p_x\Delta p_y\Delta p_z,$$

where I again set $L_x = L_y = L_z \equiv L$ (remember that $\Delta n_i = 1$, so by including the factors $\Delta n_1\Delta n_2\Delta n_3$ into the original sum, I did not really change anything). Now, when $L \to \infty$, $\Delta n_i$ remains equal to unity, but $\Delta p_i \to 0$, so that the corresponding sum in the preceding expression turns into an integral:

$$E = \frac{2V}{(2\pi\hbar)^3}\int_{-p_{Fx}}^{p_{Fx}}\int_{-p_{Fy}}^{p_{Fy}}\int_{-p_{Fz}}^{p_{Fz}} dp_x\,dp_y\,dp_z\;\varepsilon(p_x, p_y, p_z), \tag{11.53}$$

where I changed the notation for the energy to emphasize that momentum is now not a discrete index, but a continuous variable. Since the single-particle energy $\varepsilon(p_x, p_y, p_z)$ depends only upon $p^2$, it makes sense to compute the integral in Eq. 11.53 using the representation of the momentum vector in spherical coordinates. Replacing the Cartesian volume element $dp_x\,dp_y\,dp_z$ with its spherical counterpart $p^2\sin\theta\,d\theta\,d\varphi\,dp$, where $\theta$ and $\varphi$ are polar and azimuthal angles characterizing the direction of vector $\mathbf{p}$, I can rewrite Eq. 11.53 as

$$E = \frac{2V}{(2\pi\hbar)^3}\int_0^{p_F}\int_0^{\pi}\int_0^{2\pi} \varepsilon(p)\,p^2\sin\theta\,d\varphi\,d\theta\,dp,$$

where $p_F$ is the magnitude of the momentum corresponding to the Fermi energy $\varepsilon_F$. I proceed by replacing the integration variable $p$ with another variable $\varepsilon$ according to the relation $p = \sqrt{2m_e\varepsilon}$:

$$E = \frac{2V(2m_e)^{3/2}}{2(2\pi\hbar)^3}\,4\pi\int_0^{\varepsilon_F} \varepsilon\sqrt{\varepsilon}\,d\varepsilon. \tag{11.54}$$

Before computing this integral, let me point out that it can be rewritten in the following form:

$$E = \int_0^{\varepsilon_F} \varepsilon\, g(\varepsilon)\,d\varepsilon, \tag{11.55}$$

where I introduced the quantity

$$g(\varepsilon) = \frac{V(2m_e)^{3/2}\sqrt{\varepsilon}}{2\pi^2\hbar^3} \tag{11.56}$$

called the density of states. This quantity, which means the number of states per unit energy interval, is an important characteristic of any many-particle system. Actually, it is so important that it gives me an incentive to deviate from the original goal of computing the integral in Eq. 11.54 and spend more time talking about it.

To convince you that $g(\varepsilon)\,d\varepsilon$ can, indeed, be interpreted as the number of states with energies within the interval $[\varepsilon, \varepsilon + d\varepsilon]$, I will simply compute this quantity directly using the same state counting technique that I used to find the Fermi energy. However, this time around I am interested in the number of states within a spherical shell with inner radius $n(\varepsilon)$ and outer radius $n(\varepsilon + d\varepsilon)$:

$$n(\varepsilon + d\varepsilon) = n(\varepsilon) + \frac{dn}{d\varepsilon}\,d\varepsilon = \frac{L}{2\pi\hbar}\sqrt{2m_e\varepsilon} + \frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\varepsilon}}\,d\varepsilon,$$

where I used Eq. 11.49 for $n(\varepsilon)$. The volume occupied by this shell is

$$\Delta V = 4\pi n^2\,dn = 4\pi n^2\frac{dn}{d\varepsilon}\,d\varepsilon = 4\pi\,\frac{L^2\,2m_e\varepsilon}{4\pi^2\hbar^2}\,\frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\varepsilon}}\,d\varepsilon = \frac{V(2m_e)^{3/2}\sqrt{\varepsilon}}{4\pi^2\hbar^3}\,d\varepsilon.$$

Using again the fact that the volume allocated to a single point in Fig. 11.3 is equal to one and that there are two single-electron orbitals per point, the total number of states within this spherical layer is

$$\frac{V(2m_e)^{3/2}\sqrt{\varepsilon}}{2\pi^2\hbar^3}\,d\varepsilon,$$

which according to Eq. 11.56 is exactly $g(\varepsilon)\,d\varepsilon$. Now I can go back to Eq. 11.55 and complete the computation of the integral,

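which is quite straightforward and yields

$$E = \frac{V(2m_e)^{3/2}}{2\pi^2\hbar^3}\int_0^{\varepsilon_F} \varepsilon\sqrt{\varepsilon}\,d\varepsilon = \frac{V(2m_e)^{3/2}}{5\pi^2\hbar^3}\,\varepsilon_F^{5/2} = \frac{3}{5}N\left(\frac{V(2m_e)^{3/2}}{3\pi^2 N(\hbar^2)^{3/2}}\right)\varepsilon_F^{5/2}.$$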

In the last step, I rearranged the expression for the energy to make it clear (with the help of Eq. 11.51) that the expression in the parentheses is $\varepsilon_F^{-3/2}$ and that the ground state energy of the non-interacting free electron gas can be written down as

$$E = \frac{3}{5}N\varepsilon_F. \tag{11.57}$$
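Equation 11.57 is easy to confirm numerically. Combining Eqs. 11.56 and 11.51, the density of states can be rewritten as $g(\varepsilon) = \frac{3N}{2}\sqrt{\varepsilon}/\varepsilon_F^{3/2}$; the sketch below uses this form in units where $\varepsilon_F = 1$ and $N = 1$ (an assumption made purely for convenience) and integrates with a plain midpoint rule. The helper names are hypothetical.

```python
import numpy as np

def density_of_states(eps, n_electrons=1.0, eps_fermi=1.0):
    """g(eps) of Eq. 11.56, rewritten via Eq. 11.51 as (3N/2) sqrt(eps) / eps_F^(3/2)."""
    return 1.5 * n_electrons * np.sqrt(eps) / eps_fermi**1.5

def midpoint_integral(func, a, b, steps=200_000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    width = (b - a) / steps
    mid = a + width * (np.arange(steps) + 0.5)
    return float(np.sum(func(mid)) * width)

# Total number of occupied orbitals: should reproduce N = 1.
number = midpoint_integral(density_of_states, 0.0, 1.0)
# Total ground state energy, Eq. 11.55: should reproduce (3/5) N eps_F.
energy = midpoint_integral(lambda e: e * density_of_states(e), 0.0, 1.0)

if __name__ == "__main__":
    print("N =", number, " E / (N eps_F) =", energy / number)
```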

This expression can also be rewritten in another no less illuminating form. Substituting Eq. 11.51 into Eq. 11.57, I can present the energy $E$ of the gas as a function of volume:

$$E = \frac{3\hbar^2\left(3\pi^2\right)^{2/3}}{10 m_e}\,\frac{N^{5/3}}{V^{2/3}}.$$

The fact that this energy depends on the volume draws out an important point: if you try to expand (or contract) the volume occupied by the gas, its energy changes, which means that someone has to do some work to effect this change. Recalling a simple formula from an introductory thermodynamics class, $dW = P\,dV$, where $W$ is work and $P$ is the pressure exerted by the gas on the walls of the containing vessel, and taking into account that for a fixed number of particles the energy $E$ depends only on the volume $V$ (no temperature), you can relate $dW$ to $-dE$ and determine the pressure exerted by the non-interacting electrons on the walls of the container as

$$P = -\frac{dE}{dV} = \frac{\hbar^2\left(3\pi^2\right)^{2/3}}{5 m_e}\,\frac{N^{5/3}}{V^{5/3}}.$$
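For a sense of scale, the pressure formula can be evaluated for a metallic electron concentration. The sketch below is an added illustration: the copper value $8.47 \times 10^{28}\ \mathrm{m^{-3}}$ is a commonly quoted textbook figure and is assumed here, and the function names are hypothetical. It also checks the equivalent form $P = \frac{2}{5}(N/V)\,\varepsilon_F$, which follows from combining the pressure formula with Eq. 11.51.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg

def fermi_energy(concentration):
    """Fermi energy (J), Eq. 11.51, for concentration N/V in m^-3."""
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * concentration) ** (2.0 / 3.0)

def degeneracy_pressure(concentration):
    """Ground-state pressure (Pa) of the ideal electron gas, P = -dE/dV."""
    prefactor = HBAR**2 * (3.0 * math.pi**2) ** (2.0 / 3.0) / (5.0 * M_E)
    return prefactor * concentration ** (5.0 / 3.0)

if __name__ == "__main__":
    n_copper = 8.47e28  # assumed conduction-electron concentration, m^-3
    pressure = degeneracy_pressure(n_copper)
    print("pressure in Pa:", pressure)
    print("ratio to (2/5) n eps_F:", pressure / (0.4 * n_copper * fermi_energy(n_copper)))
```

The result is of the order of tens of gigapascals, far above atmospheric pressure, even though the gas is at zero temperature.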

Thus, even in the ground state (which, by the way, from the thermodynamic point of view corresponds to zero temperature), an electron gas exerts a pressure on the surrounding medium which depends on the concentration of the electrons. The coolest thing about this result is that unlike the case of a classical ideal gas, this pressure has nothing to do with the thermal motion of the electrons because they are in the ground state, which is equivalent to their temperature being equal to zero. This pressure is a purely quantum effect solely due to the indistinguishability of the electrons and their fermion nature.

11.6 Problems

Problems for Sect. 11.1

Problem 140 Consider the following configuration of single-particle orbitals for a system of four identical fermions:

$$\left|\alpha_1^{(1)}\right\rangle \left|\alpha_2^{(2)}\right\rangle \left|\alpha_3^{(3)}\right\rangle \left|\alpha_4^{(4)}\right\rangle.$$

Applying the exchange operator $\hat{P}(i,j)$ to all pairs of particles in this configuration, generate all possible transpositions of the particles and determine the correct signs in front of them. Write down the correct antisymmetric four-fermion state involving these single-particle orbitals.

Problem 141 Consider the system of two bosons that can be in one of four single-particle orbitals. List all possible two-boson states adhering to the symmetrization requirements.

Problem 142 Consider two non-interacting electrons in a one-dimensional harmonic oscillator potential characterized by classical frequency $\omega$.

1. Consider single-electron orbitals $|\alpha_{n,m_s}\rangle = |n\rangle|m_s\rangle$, where $|n\rangle$ is an eigenvector of the harmonic oscillator and $|m_s\rangle$ is a spinor describing one of two possible eigenvectors of operator $\hat{S}_z$. Using orbitals $|\alpha_{n,m_s}\rangle$, write down the Slater determinant for the two-electron ground state, and find the corresponding ground state energy.

2. Do the same for the first excited state(s) of this system.

3. Write the two-electron states found in Parts I and II in the position-spinor representation.

4. Now use the eigenvectors of the total spin of the two particles to construct the two-particle ground and first excited states. Find the relations between the two-particle states found here and those found in Parts I and II.

5. Compute the expectation value $\left\langle (z_1 - z_2)^2 \right\rangle$, where $z_{1,2}$ are coordinates of the two electrons in the states determined above.

Problem 143 Repeat Problem 142 for two non-interacting bosons.


Problems for Sect. 11.3

Problem 144 Consider an atom of nitrogen, which has three electrons in $l = 1$ states.

1. Using single-particle orbitals with $l = 1$ and different values of orbital and spin magnetic numbers, construct all possible Slater determinants representing possible three-electron states.

2. Applying the operator $\hat{S}_z^{(1)} + \hat{S}_z^{(2)} + \hat{S}_z^{(3)}$ to all found three-particle states, figure out the possible values of the total spin in these states.

Problems for Sect. 11.4

Problem 145 Consider two identical non-interacting particles both in the ground states of their respective harmonic oscillator potentials. Particle 1 is in the potential $V_1 = \frac{1}{2}m\omega^2 x_1^2$, while particle 2 is in the potential $V_2 = \frac{1}{2}m\omega^2 (x_2 - d)^2$.

1. Assuming that the particles are spin-1/2 fermions in a singlet spin state, write down the orbital portion of the two-particle state and compute the expectation value of the two-particle Hamiltonian

$$\hat{H} = \frac{\hat{p}_1^2}{2m_e} + \frac{\hat{p}_2^2}{2m_e} + \frac{1}{2}m\omega^2 x_1^2 + \frac{1}{2}m\omega^2 (x_2 - d)^2$$

in this state.

2. Repeat the calculations assuming that the particles are in the state with total spin $S = 1$.

3. The energy you found in Parts I and II depends upon the distance $d$ between the equilibrium points of the potential. Classically such a dependence would mean that there is a force associated with this energy and describing repulsive or attractive interaction between the two particles. In the case under consideration, there is no real interaction, and what you have found is a purely quantum effect due to symmetry requirements on the two-particle states. Still, you can describe the result in terms of the effective “force” of interaction between the particles. Find this force for both singlet and triplet spin states, and specify its character (attractive or repulsive).

Problem 146 Consider two electrons confined in a one-dimensional infinite potential well of width $d$ and interacting with each other via potential $V_{int} = -E_0 (z_1 - z_2)^2$, where $E_0$ is a real positive constant and $z_{1,2}$ are coordinates of the electrons.

1. Construct the ground state two-electron wave function assuming that electrons are (a) in a singlet spin state and (b) in a triplet spin state.

2. Compute the expectation value of the interaction potential in each of these states.

3. With the interaction term included, which spin configuration would have smaller energy?

Problem for Sect. 11.5

Problem 147 Consider an ideal gas of $N$ electrons confined in a three-dimensional harmonic oscillator potential:

$$\hat{V} = \frac{1}{2}m_e\omega^2\left(x^2 + y^2 + z^2\right) \equiv \frac{1}{2}m_e\omega^2 r^2.$$

Find the Fermi energy of this system and the total energy of the many-electron ground state. Hint: The degeneracy degrees of the single-particle energy levels in this case can be easily found analytically, so no transition to a quasi-continuous spectrum and from summation to integration is necessary.

Part III Quantum Phenomena and Methods

In this part of the book, I will introduce you to the wonderful world of actual experimentally observable quantum mechanical phenomena. Theoretical description of each of these phenomena will require developing special technical methods, which I will present as we go along. So, let the journey begin.

Chapter 12 Resonant Tunneling

12.1 Transfer-Matrix Approach in One-Dimensional Quantum Mechanics

12.1.1 Transfer Matrix: General Formulation

In Sects. 6.2 and 6.3 of Chap. 6, I introduced one-dimensional quantum mechanical models, in which the potential energy of a particle was described by the simplest piecewise constant function (or its extreme case, a delta-function), defining a single potential well or barrier. A natural extension of this model is a potential energy profile corresponding to several wells and/or barriers (or several delta-functions). In principle, one can approach the multi-barrier problem in the same way as a single well/barrier situation: divide the entire range of the coordinate into regions of constant potential energy, and use the continuity conditions for the wave function and its derivative to “stitch” the solutions from different regions. However, it is easier said than done. Each new discontinuity point adds two new unknown coefficients and correspondingly two equations. If in the case of a single barrier you had to deal with a system of four equations, a dual-barrier problem would require solving a system of eight equations, and soon even writing those equations down becomes a serious burden, and I do not even want to think about having to solve them.

Luckily, there is a better way of dealing with the ever-increasing number of the boundary conditions in problems with multiple jumps of the potential energy. In this section I will show you a convenient method of arranging the unknown amplitudes of the wave functions and relating them to each other across the point of the discontinuity.

© Springer International Publishing AG, part of Springer Nature 2018 L.I. Deych, Advanced Undergraduate Quantum Mechanics, https://doi.org/10.1007/978-3-319-71550-6_12


Let’s move forward by going back to the simplest problem of a step potential with a single discontinuity:

$$V(z) = \begin{cases} V_0, & z < 0 \\ V_1, & z > 0, \end{cases} \tag{12.1}$$

where I assigned the coordinate $z = 0$ (the origin of the coordinate axes) to the point where the potential makes its jump. If I were to ask a good diligent student of quantum mechanics to write down a wave function of a particle with energy $E$ exceeding both $V_0$ and $V_1$, I would have most likely been presented with the following expression:

$$\psi(z) = \begin{cases} A_0\exp(ik_0 z) + B_0\exp(-ik_0 z), & z < 0 \\ A_1\exp(ik_1 z), & z > 0, \end{cases} \tag{12.2}$$

where

$$k_0 = \frac{\sqrt{2m_e(E - V_0)}}{\hbar}, \qquad k_1 = \frac{\sqrt{2m_e(E - V_1)}}{\hbar}.$$

This wave function would have been perfectly fine if all I were after was just the single step-potential problem. In this section, however, I have further-reaching goals, so I need to generalize this expression, allowing for the possibility of a wave function component corresponding to particles propagating in the negative $z$ direction for $z > 0$ as well as for $z < 0$. If you wonder where these backward propagating particles could come from, just imagine that there might be another discontinuity in the potential somewhere down the line, at a positive value of $z$, which would create a flux of reflected particles propagating in the negative $z$ direction at $z > 0$. To take this possibility into account, I will replace Eq. 12.2 with a wave function of a more general form

$$\psi(z) = \begin{cases} A_0\exp(ik_0 z) + B_0\exp(-ik_0 z), & z < 0 \\ A_1\exp(ik_1 z) + B_1\exp(-ik_1 z), & z > 0. \end{cases} \tag{12.3}$$

The continuity of the wave function and its derivative at $z = 0$ then yields

$$A_0 + B_0 = A_1 + B_1 \tag{12.4}$$

$$k_0(A_0 - B_0) = k_1(A_1 - B_1). \tag{12.5}$$


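Quite similarly to what I already did in Sect. 6.2.1, I can rewrite these equations as

$$A_1 = \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right)A_0 + \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right)B_0 \tag{12.6}$$

$$B_1 = \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right)A_0 + \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right)B_0. \tag{12.7}$$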

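However, for the next step, I prepared for you something new. After spending some time staring at these two equations, you might divine that they can be presented in a matrix form if the amplitudes $A_{1,0}$ and $B_{1,0}$ are arranged into a two-dimensional column vector, while the coefficients in front of $A_0$ and $B_0$ are arranged into a $2 \times 2$ matrix:

$$\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) \\ \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) \end{bmatrix} \begin{bmatrix} A_0 \\ B_0 \end{bmatrix}. \tag{12.8}$$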

Go ahead, perform the matrix multiplication in Eq. 12.8, and convince yourself that the result is, indeed, the system of Eqs. 12.6 and 12.7. If we agree to always use the amplitude of the forward propagating component of the wave function (whatever appears in front of $\exp(ik_i z)$) as the first element in the two-dimensional column and the amplitude of the backward propagating component (the one appearing in front of $\exp(-ik_i z)$) as the second element, I can introduce notation $v_{0,1}$ for the respective columns, $D^{(1,0)}$ for the matrix

$$D^{(1,0)} = \begin{bmatrix} \frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\ \frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1} \end{bmatrix}, \tag{12.9}$$

and rewrite Eq. 12.8 as a compact matrix equation:

$$v_1 = D^{(1,0)} v_0. \tag{12.10}$$

Upper indexes in the notation for this matrix are supposed to be read from right to left and symbolize a transition across a boundary between potentials $V_0$ and $V_1$.
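If you would rather let a computer perform the multiplication in Eq. 12.8, a few lines suffice. The sketch below is an added illustration: it builds the matrix of Eq. 12.9 for arbitrary wave numbers and verifies that the resulting amplitudes satisfy the continuity conditions, Eqs. 12.4 and 12.5. The function name `interface_matrix` and the sample numbers are hypothetical choices.

```python
import numpy as np

def interface_matrix(k_left, k_right):
    """Matrix of Eq. 12.9 connecting amplitudes (A, B) across a potential step.

    k_left, k_right: wave numbers on the left and right of the discontinuity.
    """
    s = (k_right + k_left) / (2.0 * k_right)
    d = (k_right - k_left) / (2.0 * k_right)
    return np.array([[s, d], [d, s]], dtype=complex)

# Sample wave numbers and left-hand amplitudes (arbitrary illustrative values).
k0, k1 = 1.0, 2.5
v0 = np.array([1.0, 0.3 + 0.2j])      # column (A0, B0)
v1 = interface_matrix(k0, k1) @ v0    # column (A1, B1), Eq. 12.10

# Continuity of the wave function (Eq. 12.4) and of its derivative (Eq. 12.5).
wave_function_ok = np.isclose(v0[0] + v0[1], v1[0] + v1[1])
derivative_ok = np.isclose(k0 * (v0[0] - v0[1]), k1 * (v1[0] - v1[1]))

if __name__ == "__main__":
    print(v1, wave_function_ok, derivative_ok)
```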

I will not be surprised if at this point you feel a bit disappointed and are thinking: “so what, dude? This is just a fancy way of presenting what we already know.” But be patient: patience is a virtue and is usually rewarded. The real utility of the matrix notation becomes apparent only when you have to deal with potentials featuring multiple discontinuities. So, let’s get to it and assume that at some point with coordinate $z = z_1$, the potential experiences another jump, changing abruptly from $V_1$ to $V_2$. If asked to write the expression for the wave function in the regions between $z = 0$ and $z = z_1$ and for $z > z_1$, you would have probably written

$$\psi(z) = \begin{cases} A_1\exp(ik_1 z) + B_1\exp(-ik_1 z), & 0 < z < z_1 \\ A_2\exp(ik_2 z) + B_2\exp(-ik_2 z), & z > z_1, \end{cases} \tag{12.11}$$

which is, of course, a perfectly reasonable and correct expression. However, if you tried to write down the continuity equations at $z = z_1$ using this wave function and present them in a matrix form, you would have ended up with a matrix containing exponential factors like $\exp(\pm ik_{1,2}z_1)$, which would not look at all like the simple matrix $D^{(1,0)}$ from Eq. 12.9. I can try to make the situation more attractive by rewriting the expression for the wave function in a form in which the arguments of the exponential functions vanish at $z = z_1$:

$$\psi(z) = \begin{cases} A_1^{(L)}\exp\left[ik_1(z - z_1)\right] + B_1^{(L)}\exp\left[-ik_1(z - z_1)\right], & 0 < z < z_1 \\ A_1^{(R)}\exp\left[ik_2(z - z_1)\right] + B_1^{(R)}\exp\left[-ik_2(z - z_1)\right], & z > z_1. \end{cases} \tag{12.12}$$

This amounts to redefining the amplitudes appearing in front of the respective exponents, as you will see for yourselves when doing Problem 2 in the exercise section for this chapter. Please note the change in the notations: instead of distinguishing the amplitudes by their lower indexes (1 or 2), I introduced upper indexes $L$ and $R$, indicating that these coefficients describe the wave function immediately to the left or to the right of the discontinuity point, correspondingly. At the same time, the lower indexes in all coefficients are now set to be 1, implying that we are dealing with the discontinuity at point $z = z_1$. In terms of these new coefficients, the stitching conditions take the form

$$A_1^{(R)} = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right)A_1^{(L)} + \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right)B_1^{(L)} \tag{12.13}$$

$$B_1^{(R)} = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right)A_1^{(L)} + \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right)B_1^{(L)}, \tag{12.14}$$

which, with obvious substitutions $k_0 \to k_1$ and $k_1 \to k_2$, become identical to Eqs. 12.6 and 12.7. These equations can again be written in the matrix form

$$v_1^{(R)} = D^{(2,1)} v_1^{(L)}, \tag{12.15}$$

where $v_1^{(R)}$ is formed by coefficients $A_1^{(R)}$ and $B_1^{(R)}$, and $v_1^{(L)}$ by coefficients $A_1^{(L)}$ and $B_1^{(L)}$, while matrix $D^{(2,1)}$ is defined as

$$D^{(2,1)} = \begin{bmatrix} \frac{k_2 + k_1}{2k_2} & \frac{k_2 - k_1}{2k_2} \\ \frac{k_2 - k_1}{2k_2} & \frac{k_2 + k_1}{2k_2} \end{bmatrix}. \tag{12.16}$$

You might have noticed by now the common features of the matrices $D^{(1,0)}$ and $D^{(2,1)}$: (a) they both describe the transition across a boundary between two values of the potential ($V_0$ to $V_1$ for the former and $V_1$ to $V_2$ for the latter), (b) they both connect the pairs of coefficients characterizing the wave functions immediately on the left of the potential jump with those specifying the wave function immediately on the right of the jump, and, finally, (c) they have a similar structure, recognizing which enables you to write down a matrix connecting the wave function amplitudes across a generic discontinuity as

$$D^{(i+1,i)} = \begin{bmatrix} \frac{k_{i+1} + k_i}{2k_{i+1}} & \frac{k_{i+1} - k_i}{2k_{i+1}} \\ \frac{k_{i+1} - k_i}{2k_{i+1}} & \frac{k_{i+1} + k_i}{2k_{i+1}} \end{bmatrix}, \tag{12.17}$$

where

$$k_i = \frac{\sqrt{2m_e(E - V_i)}}{\hbar}$$

is determined by the potential to the left of the discontinuity and

$$k_{i+1} = \frac{\sqrt{2m_e(E - V_{i+1})}}{\hbar}$$

corresponds to the value of the potential to the right of it. It is also not too difficult to rewrite Eqs. 12.3 and 12.12 for a situation when a jump of the potential occurs at an arbitrary point $z = z_i$:

$$\psi(z) = \begin{cases} A_i^{(L)}\exp\left[ik_i(z - z_i)\right] + B_i^{(L)}\exp\left[-ik_i(z - z_i)\right], & z_{i-1} < z < z_i \\ A_i^{(R)}\exp\left[ik_{i+1}(z - z_i)\right] + B_i^{(R)}\exp\left[-ik_{i+1}(z - z_i)\right], & z > z_i. \end{cases} \tag{12.18}$$

Correspondingly, Eq. 12.15 becomes

$$v_i^{(R)} = D^{(i+1,i)} v_i^{(L)}, \tag{12.19}$$

where $v_i^{(R)}$ contains $A_i^{(R)}$ and $B_i^{(R)}$, while $v_i^{(L)}$ contains $A_i^{(L)}$ and $B_i^{(L)}$.

I hope that by now I have managed to convince you that using the suggested matrix notations does have its benefits, but I also suspect that some of you might become somewhat skeptical about the generality of this approach. You might be thinking that all these formulas that I so confidently presented here can only be valid for energies exceeding all potentials $V_i$ and that this fact strongly limits the utility of the method. If this did occur to you, accept my commendation for paying attention, but reality is not as bad as it appears. Let’s see what happens if one of the $V_i$ turns out to be larger than $E$. Obviously, in this case the respective $k_i$ becomes imaginary and can be written as

$$k_i = \frac{\sqrt{2m_e(E - V_i)}}{\hbar} = i\,\frac{\sqrt{2m_e(V_i - E)}}{\hbar} \equiv i\kappa_i, \tag{12.20}$$


where I introduced a new real-valued parameter

$$\kappa_i = \frac{\sqrt{2m_e(V_i - E)}}{\hbar}. \tag{12.21}$$

The corresponding wave function at $z < z_i$ becomes

$$\psi(z) = A_i^{(L)}\exp\left[-\kappa_i(z - z_i)\right] + B_i^{(L)}\exp\left[\kappa_i(z - z_i)\right].$$

The continuity condition of the wave function at $z = z_i$ remains the same as Eq. 12.4:

$$A_i^{(L)} + B_i^{(L)} = A_i^{(R)} + B_i^{(R)},$$

while the continuity of the derivative of the wave function yields this instead of Eq. 12.5:

$$-\kappa_i\left(A_i^{(L)} - B_i^{(L)}\right) = ik_{i+1}\left(A_i^{(R)} - B_i^{(R)}\right),$$

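where I assumed for the sake of argument that $E > V_{i+1}$. Combining these two equations, I get, instead of Eqs. 12.6 and 12.7,

$$A_i^{(R)} = \frac{1}{2}A_i^{(L)}\left(1 - \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2}B_i^{(L)}\left(1 + \frac{\kappa_i}{ik_{i+1}}\right)$$

$$B_i^{(R)} = \frac{1}{2}A_i^{(L)}\left(1 + \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2}B_i^{(L)}\left(1 - \frac{\kappa_i}{ik_{i+1}}\right),$$

which can be written again in the form of Eq. 12.19 with a new matrix

$$\tilde{D}^{(i+1,i)} = \begin{bmatrix} \frac{ik_{i+1} - \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} \\ \frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} - \kappa_i}{2ik_{i+1}} \end{bmatrix} = \begin{bmatrix} \frac{k_{i+1} + i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} - i\kappa_i}{2k_{i+1}} \\ \frac{k_{i+1} - i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} + i\kappa_i}{2k_{i+1}} \end{bmatrix}.$$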

Comparing $\tilde{D}^{(i+1,i)}$ with $D^{(i+1,i)}$ in Eq. 12.17, you can immediately see that the former can be obtained from the latter with the simple substitution defined in Eq. 12.20. Therefore, you do not really have to worry about the relation between energy and the respective value of the potential: Eq. 12.17 works in all cases, and if $k_i$ turns out to be imaginary, you just need to replace it with $i\kappa_i$ as prescribed by Eq. 12.20 (or just let the computer do it for you). Consequently, matrix $\tilde{D}^{(i+1,i)}$ turns out to be perfectly unnecessary and will not be used any more, but there is one circumstance which you must pay close attention to. The special significance of Eq. 12.20 is that when $k_i$ turns imaginary, it forces it to have a positive imaginary part (the square root obviously allows for either a positive or a negative sign). As a result, the exponential factor at the amplitude designated as $A_i$ acquires a real negative argument, while the exponential factor multiplied by amplitude $B_i$ gets a real positive argument. You need to take this into account when designating corresponding amplitudes as $A$ or $B$ and placing them in the first or the second row of your column vector. (Obviously, it is not the actual symbols used to designate the amplitudes that are important, but their places in the column vector.)
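Here is one way of letting the computer do it, offered as a hedged sketch rather than a prescribed implementation. The principal branch of the complex square root returns a positive imaginary part for a negative real argument, which is exactly the choice required by Eq. 12.20. The units $\hbar = 1$, $m_e = 1$ and all function names are assumptions of this example.

```python
import cmath
import numpy as np

def wavenumber(energy, potential, mass=1.0, hbar=1.0):
    """k = sqrt(2 m (E - V)) / hbar; for E < V the result is i*kappa with kappa > 0."""
    return cmath.sqrt(complex(2.0 * mass * (energy - potential))) / hbar

def interface_matrix(k_left, k_right):
    """Matrix of Eq. 12.17; works for real or imaginary k_left."""
    s = (k_right + k_left) / (2.0 * k_right)
    d = (k_right - k_left) / (2.0 * k_right)
    return np.array([[s, d], [d, s]], dtype=complex)

def interface_matrix_evanescent(kappa_left, k_right):
    """Explicit form of the matrix derived above for E < V_i, E > V_{i+1}."""
    s = (k_right + 1j * kappa_left) / (2.0 * k_right)
    d = (k_right - 1j * kappa_left) / (2.0 * k_right)
    return np.array([[s, d], [d, s]], dtype=complex)

# Illustrative values: barrier on the left (V = 3), free region on the right (V = 0).
energy, v_left, v_right = 1.0, 3.0, 0.0
k_left = wavenumber(energy, v_left)      # purely imaginary, Im k > 0
k_right = wavenumber(energy, v_right)    # real
kappa_left = k_left.imag
same = np.allclose(interface_matrix(k_left, k_right),
                   interface_matrix_evanescent(kappa_left, k_right))

if __name__ == "__main__":
    print(k_left, k_right, same)
```

Because the general matrix already handles imaginary wave numbers, the second function is there only to confirm that the two forms coincide.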

I hope that your head is not spinning yet, but as a prophylactic measure, let me summarize what we have achieved so far. We are considering a particle moving in a piecewise constant potential, which has interruptions of continuity at a number of points with coordinates $z = z_i$ (the first discontinuity occurs at $z_0 = 0$). When crossing $z_i$, the potential jumps from $V_i$ to $V_{i+1}$. In the vicinity of each discontinuity point, the wave function is presented by Eq. 12.18, organized in such a way that coefficients with upper index $L$ determine amplitudes of the right- and left-propagating components of the wave function on the left of the discontinuity and coefficients with upper index $R$ determine the same amplitudes on the right of $z_i$. The connection between these pairs of coefficients is described by the matrix equation presented in Eq. 12.19.

To help you get a better feeling of why this matrix representation is useful, let me put together the matrix equations for a few successive discontinuity points:

$$v^{(R)}_2 = D^{(3,2)} v^{(L)}_2; \qquad v^{(R)}_1 = D^{(2,1)} v^{(L)}_1; \qquad v^{(R)}_0 = D^{(1,0)} v^{(L)}_0. \tag{12.22}$$

The structure of these equations indicates that it might be possible to relate column vector $v^{(R)}_2$ to $v^{(L)}_0$ by consecutive matrix multiplication if we had matrices relating $v^{(L)}_2$ to $v^{(R)}_1$, $v^{(L)}_1$ to $v^{(R)}_0$, and, in general, $v^{(L)}_i$ to $v^{(R)}_{i-1}$. To find these matrices, I have to take you back to Eq. 12.18, where you will notice that the pairs of coefficients $A^{(R)}_{i-1}, B^{(R)}_{i-1}$ and $A^{(L)}_i, B^{(L)}_i$ describe the wave function defined on the same interval $z_{i-1} < z < z_i$. Accordingly, the following must be true:

$$A^{(L)}_i \exp\left[ik_i(z - z_i)\right] + B^{(L)}_i \exp\left[-ik_i(z - z_i)\right] = A^{(R)}_{i-1} \exp\left[ik_i(z - z_{i-1})\right] + B^{(R)}_{i-1} \exp\left[-ik_i(z - z_{i-1})\right],$$

which is satisfied if

$$A^{(L)}_i \exp\left[ik_i(z - z_i)\right] = A^{(R)}_{i-1} \exp\left[ik_i(z - z_{i-1})\right]$$

and

$$B^{(L)}_i \exp\left[-ik_i(z - z_i)\right] = B^{(R)}_{i-1} \exp\left[-ik_i(z - z_{i-1})\right].$$

Canceling the common factors $\exp(ik_i z)$ and $\exp(-ik_i z)$, respectively, you find

$$A^{(L)}_i = \exp\left[ik_i(z_i - z_{i-1})\right] A^{(R)}_{i-1} \tag{12.23}$$

$$B^{(L)}_i = \exp\left[-ik_i(z_i - z_{i-1})\right] B^{(R)}_{i-1} \tag{12.24}$$

which can be presented in the matrix form as

$$\begin{bmatrix} A^{(L)}_i \\ B^{(L)}_i \end{bmatrix} = \begin{bmatrix} \exp\left[ik_i(z_i - z_{i-1})\right] & 0 \\ 0 & \exp\left[-ik_i(z_i - z_{i-1})\right] \end{bmatrix} \begin{bmatrix} A^{(R)}_{i-1} \\ B^{(R)}_{i-1} \end{bmatrix}. \tag{12.25}$$

Introducing the diagonal matrix

$$M^{(i)} = \begin{bmatrix} \exp\left[ik_i(z_i - z_{i-1})\right] & 0 \\ 0 & \exp\left[-ik_i(z_i - z_{i-1})\right] \end{bmatrix} \tag{12.26}$$

I can give Eq. 12.25 the form

$$v^{(L)}_i = M^{(i)} v^{(R)}_{i-1}, \tag{12.27}$$

which you can recognize as the missing relation between $v^{(L)}_i$ and $v^{(R)}_{i-1}$. Note that the upper index in $M^{(i)}$ signifies that it corresponds to the region of coordinates $z_{i-1} < z < z_i$, where the potential is equal to $V_i$. It is important to note that Eq. 12.26 can be used even if $k_i$ turns out to be imaginary. All you will need to do in this case is to replace $k_i$ with $i\kappa_i$ according to Eqs. 12.20 and 12.21. Now, complementing Eq. 12.22 with the missing links, you get

$$v^{(R)}_2 = D^{(3,2)} v^{(L)}_2; \quad v^{(L)}_2 = M^{(2)} v^{(R)}_1; \quad v^{(R)}_1 = D^{(2,1)} v^{(L)}_1; \quad v^{(L)}_1 = M^{(1)} v^{(R)}_0; \quad v^{(R)}_0 = D^{(1,0)} v^{(L)}_0, \tag{12.28}$$

which, after combining all successive matrix relations, yields

$$v^{(R)}_2 = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} v^{(L)}_0. \tag{12.29}$$

This result illuminates the power of the method presented here: the amplitudes of the wave function after the particle has encountered three discontinuity points of the potential are expressed in terms of the amplitudes specifying the wave function in the region before the first discontinuity via a simple matrix relation, $v^{(R)}_2 = T^{(3)} v^{(L)}_0$, where matrix $T^{(3)}$, called the transfer matrix, is the product of five matrices of two different kinds:

$$T^{(3)} = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)}.$$

Matrices $D^{(i+1,i)}$ can be called interface matrices as they describe transformation of the wave function amplitudes due to crossing of an interface between two distinct values of the potential, and you can use the name “free propagation matrices” for $M^{(i)}$ because they describe the evolution of the wave function due to free propagation of the particle between two discontinuities. Equation 12.29 has a simple physical interpretation if you read it from right to left: a particle begins with a wave function characterized by column vector $v^{(L)}_0$. It encounters the first discontinuity at $z = z_0$, and upon crossing it the wave function coefficients undergo the transformation prescribed by matrix $D^{(1,0)}$. After that the wave function evolves as it would for a free particle in potential $V_1$; this evolution is described by the propagation matrix $M^{(1)}$. The crossing of the boundary between the $V_1$ and $V_2$ regions is represented by the interface matrix $D^{(2,1)}$, and so on and so forth. One of the possible potential profiles that could have been described by Eq. 12.29 is shown in Fig. 12.1.

Fig. 12.1 A potential profile corresponding to Eq. 12.29

Equation 12.29 is trivially generalized to the case of an arbitrary number, $N$, of discontinuities located at points $z_i$, $i = 0, 1, 2, \ldots, N-1$, with $z_0 = 0$:

$$v^{(R)}_{N-1} = T^{(N)} v^{(L)}_0 \tag{12.30}$$

with a corresponding transfer matrix defined as

$$T^{(N)} = D^{(N,N-1)} M^{(N-1)} \cdots D^{(2,1)} M^{(1)} D^{(1,0)}. \tag{12.31}$$

Once the transfer matrix is known, you can use it to obtain all the information about wave functions (and corresponding energy eigenvalues when appropriate) of the particle in the corresponding potential both in the continuous and discrete segments of the energy spectrum. The next section in this chapter discusses how this can be done.
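The product in Eq. 12.31 is easy to hand over to a computer. The following is a minimal sketch rather than a definitive implementation: the function names (`wavenumber`, `interface`, `propagation`, `transfer_matrix`), the argument layout, and the choice of units $\hbar = m = 1$ are assumptions made for illustration only. The interface matrix is taken in the form with entries $(k_{i+1} \pm k_i)/2k_{i+1}$, and the complex square root is relied upon to return $i\kappa_i$ with $\kappa_i > 0$ when $E < V_i$, which is the convention of Eq. 12.20.

```python
import numpy as np

def wavenumber(E, V, m=1.0, hbar=1.0):
    """k = sqrt(2m(E - V))/hbar. For E < V the principal complex root is
    i*kappa with kappa > 0, i.e. the sign convention of Eq. 12.20."""
    return np.sqrt(2.0 * m * (E - V) + 0j) / hbar

def interface(k_from, k_to):
    """Interface matrix D for a jump from wave number k_from to k_to."""
    return np.array([[k_to + k_from, k_to - k_from],
                     [k_to - k_from, k_to + k_from]]) / (2.0 * k_to)

def propagation(k, length):
    """Free propagation matrix M over a region of the given length (Eq. 12.26)."""
    return np.diag([np.exp(1j * k * length), np.exp(-1j * k * length)])

def transfer_matrix(E, potentials, boundaries, m=1.0, hbar=1.0):
    """Product of Eq. 12.31.
    potentials = [V_0, ..., V_N] (N + 1 values),
    boundaries = [z_0, ..., z_{N-1}] (N increasing coordinates)."""
    k = [wavenumber(E, V, m, hbar) for V in potentials]
    T = interface(k[0], k[1])                                          # D^(1,0)
    for i in range(1, len(boundaries)):
        T = propagation(k[i], boundaries[i] - boundaries[i - 1]) @ T  # M^(i)
        T = interface(k[i], k[i + 1]) @ T                              # D^(i+1,i)
    return T

# Illustrative numbers only: a well of depth 5 and width 1 probed at E = 1.
T_well = transfer_matrix(1.0, potentials=[0.0, -5.0, 0.0], boundaries=[0.0, 1.0])
print(T_well)
```

Multiplying from the left inside the loop mirrors the right-to-left reading of Eq. 12.29: the matrix met first by the particle stands rightmost in the product.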

12.1.2 Application of Transfer-Matrix Formalism to Generic Scattering and Bound State Problems

12.1.2.1 Generic Scattering Problem via the Transfer Matrix

Having defined a generic transfer matrix $T^{(N)}$, I can now solve a typical scattering problem similar to the one discussed in Sect. 6.2.1. Setting it up amounts to specifying the wave function of the particle at $z < 0$ (before the particle encounters the first break of the continuity) and at $z > z_{N-1}$ (after the particle passes through the last discontinuity point). The scattering wave function introduced in Sect. 6.2.1

$$\psi(z) = \begin{cases} \exp(ik_0 z) + r \exp(-ik_0 z), & z < 0 \\ t \exp(ik_N z), & z > z_{N-1} \end{cases} \tag{12.32}$$

is in the transfer-matrix formalism described by column vectors $v^{(L)}_0$ and $v^{(R)}_{N-1}$:

$$v^{(L)}_0 = \begin{bmatrix} 1 \\ r \end{bmatrix}; \qquad v^{(R)}_{N-1} = \begin{bmatrix} t \\ 0 \end{bmatrix}. \tag{12.33}$$

Presenting the generic T-matrix by its (presumably known) elements

$$T^{(N)} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix},$$

I can rewrite Eq. 12.30 in the expanded form as

$$\begin{bmatrix} t \\ 0 \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix} \begin{bmatrix} 1 \\ r \end{bmatrix}.$$

This translates into the system of linear equations:

$$t = t_{11} + r t_{12}, \qquad 0 = t_{21} + r t_{22}.$$

From the second of these equations, I immediately have

$$r = -\frac{t_{21}}{t_{22}}, \tag{12.34}$$

and substituting this result into the first one, I find

$$t = t_{11} - \frac{t_{12} t_{21}}{t_{22}} = \frac{\det\left(T^{(N)}\right)}{t_{22}}. \tag{12.35}$$

Here $\det\left(T^{(N)}\right) \equiv t_{11} t_{22} - t_{12} t_{21}$ is the determinant of the T-matrix $T^{(N)}$, which, believe it or not, can actually be quite easily computed for the most general transfer matrix defined in Eq. 12.31.

To do so you must, first, recall that the determinant of a product of matrices is equal to the product of the determinants of the individual factors:

$$\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right) \det\left(M^{(N-1)}\right) \cdots \det\left(D^{(2,1)}\right) \det\left(M^{(1)}\right) \det\left(D^{(1,0)}\right). \tag{12.36}$$

It is easy to see that $\det\left(M^{(i)}\right) = 1$ for any $i$, so all these factors can be omitted from Eq. 12.36, yielding

$$\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right) \det\left(D^{(N-1,N-2)}\right) \cdots \det\left(D^{(2,1)}\right) \det\left(D^{(1,0)}\right). \tag{12.37}$$

Now all I need is to compute the determinant of the generic matrix $D^{(i+1,i)}$. Using Eq. 12.17, I find

$$\det\left(D^{(i+1,i)}\right) = \left(\frac{k_{i+1} + k_i}{2k_{i+1}}\right)^2 - \left(\frac{k_{i+1} - k_i}{2k_{i+1}}\right)^2 = \frac{k_i}{k_{i+1}},$$

which leads to the following expression for $\det\left(T^{(N)}\right)$:

$$\det\left(T^{(N)}\right) = \frac{k_{N-1}}{k_N} \frac{k_{N-2}}{k_{N-1}} \cdots \frac{k_1}{k_2} \frac{k_0}{k_1} = \frac{k_0}{k_N}. \tag{12.38}$$

Isn’t it amazing how all $k_i$ in the intermediate regions cancel, so that the determinant depends only upon the wave numbers (real or imaginary) in the first and the last regions of constant potential? Using this result in Eq. 12.35, I can find a simplified expression for the transmission amplitude

$$t = \frac{k_0}{k_N} \frac{1}{t_{22}}, \tag{12.39}$$

which becomes even simpler if the potential for $z < 0$ and for $z > z_{N-1}$ is the same. In this case the determinant of the transfer matrix becomes equal to unity and $t = 1/t_{22}$. Having found $r$ and $t$, I can restore the wave function in the entire range of the coordinate by consecutively applying the interface and propagation matrices constituting the total transfer matrix $T^{(N)}$.
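As a quick sanity check of Eqs. 12.34 and 12.39, consider the simplest possible case, which is not treated in the text and is added here only as an illustration: a single potential step at $z_0 = 0$, so that $N = 1$ and the transfer matrix reduces to one interface matrix. Assuming the interface matrix has entries $(k_1 \pm k_0)/2k_1$, as in the determinant computation above, one finds

$$T^{(1)} = D^{(1,0)} = \frac{1}{2k_1} \begin{bmatrix} k_1 + k_0 & k_1 - k_0 \\ k_1 - k_0 & k_1 + k_0 \end{bmatrix},$$

$$r = -\frac{t_{21}}{t_{22}} = \frac{k_0 - k_1}{k_0 + k_1}, \qquad t = \frac{k_0}{k_1} \frac{1}{t_{22}} = \frac{k_0}{k_1} \cdot \frac{2k_1}{k_1 + k_0} = \frac{2k_0}{k_0 + k_1}.$$

These are the familiar amplitudes for a step potential, and they satisfy $1 + r = t$, which is just the continuity of the wave function at $z = 0$.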

With the help of Eq. 6.53 from Sect. 6.2.1, I can also find the corresponding reflection and transmission probabilities:

$$R = |r|^2 = \left|\frac{t_{21}}{t_{22}}\right|^2, \qquad T = \frac{k_N^2}{k_0^2} |t|^2 = \left|\frac{1}{t_{22}}\right|^2,$$

where I used Eq. 12.39 for $t$. Since reflection and transmission probabilities must obey the condition $R + T = 1$, it imposes the following general condition on the elements of the transfer matrix:

$$|t_{22}|^2 - |t_{12}|^2 = 1.$$

Fig. 12.2 An example of a potential with discrete spectrum

12.1.2.2 Finding Bound States with the Transfer Matrix

Now let me show how the transfer-matrix method can be used to find energies of the bound states, if they are allowed by the potential. Consider, for instance, the potential shown in Fig. 12.2. After eyeballing this figure for a few moments and recalling that discrete energy levels correspond to classically bound motion, you should be able to conclude that states with energies in the interval $V_3 < E < V_4$ must belong to the discrete spectrum. An important general point to make here is that the discrete spectrum in such a Hamiltonian exists at energies that are smaller than the limiting values of the potential at $z \to \pm\infty$ and larger than the potential’s smallest value, provided that these conditions are not self-contradictory. For such values of energy, the solutions of the Schrödinger equation for $z < 0$ and $z > z_{N-1}$ (classically forbidden regions) take the form of real exponential functions, so that instead of Eq. 12.32, I have

$$\psi(z) = \begin{cases} B_0 \exp(\kappa_0 z), & z < 0 \\ A_N \exp(-\kappa_N z), & z > z_{N-1}, \end{cases} \tag{12.40}$$

where

$$\kappa_0 = \frac{\sqrt{2m(V_0 - E)}}{\hbar}, \qquad \kappa_N = \frac{\sqrt{2m(V_N - E)}}{\hbar}.$$

Before continuing I have to reiterate a point that I already made earlier in this section. Equation 12.40 was obtained by making the transition from parameters $k_0$ and $k_N$, which become imaginary for the chosen values of energy, to real parameters $\kappa_0$ and $\kappa_N$ with the help of Eq. 12.20. This procedure turns exponential functions $\exp(\pm ikz)$ into $\exp(\mp\kappa z)$. Accordingly, in order to preserve the structure of my transfer matrices, I have to designate amplitude coefficients in front of $\exp(\kappa_i z)$ as $B_i$ and coefficients in front of $\exp(-\kappa_i z)$ as $A_i$. Finally, I feel obliged to remind you that I discarded exponentially growing terms in Eq. 12.40 in order to preserve normalizability of the wave function. Thus, now, the vectors $v^{(L)}_0$ and $v^{(R)}_{N-1}$, instead of Eq. 12.33, take the form

$$v^{(L)}_0 = \begin{bmatrix} 0 \\ B_0 \end{bmatrix}; \qquad v^{(R)}_{N-1} = \begin{bmatrix} A_N \\ 0 \end{bmatrix}.$$

The resulting transfer-matrix equation in this case becomes

$$\begin{bmatrix} A_N \\ 0 \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix} \begin{bmatrix} 0 \\ B_0 \end{bmatrix},$$

which yields

$$A_N = t_{12} B_0, \qquad 0 = t_{22} B_0.$$

The last of these equations produces an equation for the allowed energy values, since it can only be fulfilled for nonvanishing $B_0$ and $A_N$ if

$$t_{22}(E) = 0. \tag{12.41}$$

The first of these equations expresses $A_N$ in terms of the remaining undetermined coefficient $B_0$, which can be fixed by the normalization requirement.

12.1.3 Application of the Transfer Matrix to a Symmetrical Potential Well

To illustrate the transfer-matrix method, I will now apply it to a problem that we have already solved in Sect. 6.2.1: the states of a particle in a symmetric rectangular potential well. To facilitate application of the transfer-matrix approach, I will describe this potential by the function

$$V(z) = \begin{cases} V_b, & z < 0 \\ V_w, & 0 < z < d \\ V_b, & z > d, \end{cases} \tag{12.42}$$

which differs from the one used in Sect. 6.2.1 by the choice of the origin of the coordinate axis for $z$. This potential has two discontinuity points: it changes from $V_b$ to $V_w$ at $z_0 = 0$ and then, again, from $V_w$ to $V_b$ at $z_1 = d$, where it is assumed that $V_b > V_w$. Correspondingly, I need to introduce two interface matrices: $D^{(1,0)}$ as defined in Eq. 12.9 with $k_0 = \sqrt{2m_e(E - V_b)}/\hbar$ and $k_1 = \sqrt{2m_e(E - V_w)}/\hbar$, and $D^{(2,1)}$ defined in Eq. 12.16 with $k_2 = k_0$.

12.1.3.1 Scattering States

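Scattering states (continuous spectrum) of this potential correspond to energies $E > V_b$, in which case parameters $k_0$ and $k_1$ are regular real-valued wave numbers. Inserting the free propagation matrix $M^{(1)}$ from Eq. 12.26 between $D^{(2,1)}$ and $D^{(1,0)}$ according to Eq. 12.31 and taking into account that $z_0 = 0$ and $z_1 = d$, I obtain the total T-matrix

$$T^{(2)} = D^{(2,1)} M^{(1)} D^{(1,0)} = \begin{bmatrix} \frac{k_0 + k_1}{2k_0} & \frac{k_0 - k_1}{2k_0} \\ \frac{k_0 - k_1}{2k_0} & \frac{k_0 + k_1}{2k_0} \end{bmatrix} \begin{bmatrix} e^{ik_1 d} & 0 \\ 0 & e^{-ik_1 d} \end{bmatrix} \begin{bmatrix} \frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\ \frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1} \end{bmatrix}$$

$$= \begin{bmatrix} \frac{k_0 + k_1}{2k_0} e^{ik_1 d} & \frac{k_0 - k_1}{2k_0} e^{-ik_1 d} \\ \frac{k_0 - k_1}{2k_0} e^{ik_1 d} & \frac{k_0 + k_1}{2k_0} e^{-ik_1 d} \end{bmatrix} \begin{bmatrix} \frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\ \frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1} \end{bmatrix}$$

$$= \begin{bmatrix} \frac{(k_0 + k_1)^2 e^{ik_1 d} - (k_0 - k_1)^2 e^{-ik_1 d}}{4k_0 k_1} & \frac{(k_1^2 - k_0^2)\left[e^{ik_1 d} - e^{-ik_1 d}\right]}{4k_0 k_1} \\ -\frac{(k_1^2 - k_0^2)\left[e^{ik_1 d} - e^{-ik_1 d}\right]}{4k_0 k_1} & \frac{(k_0 + k_1)^2 e^{-ik_1 d} - (k_0 - k_1)^2 e^{ik_1 d}}{4k_0 k_1} \end{bmatrix}$$

$$= \frac{1}{2k_0 k_1} \begin{bmatrix} i\left(k_0^2 + k_1^2\right) \sin k_1 d + 2k_0 k_1 \cos k_1 d & i\left(k_1^2 - k_0^2\right) \sin k_1 d \\ -i\left(k_1^2 - k_0^2\right) \sin k_1 d & -i\left(k_0^2 + k_1^2\right) \sin k_1 d + 2k_0 k_1 \cos k_1 d \end{bmatrix}. \tag{12.43}$$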

Substitution of the corresponding elements of the T-matrix from the last expression into Eqs. 12.34 and 12.39 yields the amplitude reflection and transmission coefficients:

$$r = \frac{i\left(k_1^2 - k_0^2\right) \sin k_1 d}{-i\left(k_0^2 + k_1^2\right) \sin k_1 d + 2k_0 k_1 \cos k_1 d} = \frac{\left(k_0^2 - k_1^2\right) \sin k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}, \tag{12.44}$$

$$t = \frac{2k_0 k_1}{-i\left(k_0^2 + k_1^2\right) \sin k_1 d + 2k_0 k_1 \cos k_1 d} = \frac{2ik_0 k_1}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}, \tag{12.45}$$

where at the last steps, the numerators and denominators of the expressions for $r$ and $t$ were multiplied by $i$. The resulting expressions coincide with Eqs. 6.58 and 12.35 of Sect. 6.2, which, of course, is not surprising. Having found the reflection and transmission amplitudes, I can easily restore the entire wave function. Indeed, substitution of Eqs. 12.45 and 12.44 into Eq. 12.32 yields the wave function for $z < 0$ and $z > d$. Next, using Eq. 12.10 with $v_0$ in the form

$$v_0 = \begin{bmatrix} 1 \\ r \end{bmatrix}$$

I find coefficients $A^{(R)}_0$ and $B^{(R)}_0$:

$$\begin{bmatrix} A^{(R)}_0 \\ B^{(R)}_0 \end{bmatrix} = \begin{bmatrix} \frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\ \frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1} \end{bmatrix} \begin{bmatrix} 1 \\ r \end{bmatrix} \quad \Rightarrow$$

$$A^{(R)}_0 = \frac{k_1 + k_0 + r(k_1 - k_0)}{2k_1}, \tag{12.46}$$

$$B^{(R)}_0 = \frac{k_1 - k_0 + r(k_1 + k_0)}{2k_1}, \tag{12.47}$$

which generate the wave function in the region $0 < z < d$:

$$\psi(z) = \frac{k_1(1 + r) + k_0(1 - r)}{2k_1} e^{ik_1 z} + \frac{k_1(1 + r) - k_0(1 - r)}{2k_1} e^{-ik_1 z}.$$

I will leave it as an exercise to demonstrate that coefficients $A^{(R)}_0$ and $B^{(R)}_0$ in Eqs. 12.46 and 12.47 coincide with coefficients $A_2$ and $B_2$ in Eqs. 6.60 and 6.61 in Sect. 6.2. Rewriting the expression for the wave function as

$$\psi(z) = \frac{k_1(1 + r) + k_0(1 - r)}{2k_1} e^{ik_1 d} e^{ik_1(z - d)} + \frac{k_1(1 + r) - k_0(1 - r)}{2k_1} e^{-ik_1 d} e^{-ik_1(z - d)},$$

where I simply multiplied each term by $\exp(ik_1 d) \exp(-ik_1 d) \equiv 1$, you can identify coefficients $A^{(L)}_1$ and $B^{(L)}_1$ as

$$A^{(L)}_1 = \frac{k_1 + k_0 + r(k_1 - k_0)}{2k_1} e^{ik_1 d}, \qquad B^{(L)}_1 = \frac{k_1 - k_0 + r(k_1 + k_0)}{2k_1} e^{-ik_1 d}.$$

The same expressions for $A^{(L)}_1$ and $B^{(L)}_1$ can obviously be found by multiplying the diagonal matrix $M^{(1)}$ by $v^{(R)}_0$ formed by coefficients $A^{(R)}_0$ and $B^{(R)}_0$. Finally, in order to convince the skeptics that the outlined procedure is self-consistent, you can try to apply the interface matrix $D^{(2,1)}$ to $A^{(L)}_1$ and $B^{(L)}_1$:

$$\begin{bmatrix} A^{(R)}_2 \\ B^{(R)}_2 \end{bmatrix} = \begin{bmatrix} \frac{k_0 + k_1}{2k_0} & \frac{k_0 - k_1}{2k_0} \\ \frac{k_0 - k_1}{2k_0} & \frac{k_0 + k_1}{2k_0} \end{bmatrix} \begin{bmatrix} \frac{k_1 + k_0 + r(k_1 - k_0)}{2k_1} e^{ik_1 d} \\ \frac{k_1 - k_0 + r(k_1 + k_0)}{2k_1} e^{-ik_1 d} \end{bmatrix}, \tag{12.48}$$

yielding for $A^{(R)}_2$

$$A^{(R)}_2 = \frac{(k_0 + k_1)^2}{4k_0 k_1} e^{ik_1 d} + r \frac{k_1^2 - k_0^2}{4k_0 k_1} e^{ik_1 d} - \frac{(k_0 - k_1)^2}{4k_0 k_1} e^{-ik_1 d} - r \frac{k_1^2 - k_0^2}{4k_0 k_1} e^{-ik_1 d}$$

$$= e^{ik_1 d} \left[\frac{k_0(1 - r)}{4k_1} + \frac{k_1(1 + r)}{4k_0} + \frac{1}{2}\right] - e^{-ik_1 d} \left[\frac{k_0(1 - r)}{4k_1} + \frac{k_1(1 + r)}{4k_0} - \frac{1}{2}\right].$$

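To continue I have to use the reflection coefficient $r$ given by Eq. 12.44. Evaluating parts of the expression for $A^{(R)}_2$ separately, I find

$$\frac{k_0(1 - r)}{4k_1} = \frac{k_0}{4k_1} \left[1 - \frac{\left(k_0^2 - k_1^2\right) \sin k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}\right] = \frac{k_0}{2} \frac{k_1 \sin k_1 d + ik_0 \cos k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d},$$

$$\frac{k_1(1 + r)}{4k_0} = \frac{k_1}{4k_0} \left[1 + \frac{\left(k_0^2 - k_1^2\right) \sin k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}\right] = \frac{k_1}{2} \frac{k_0 \sin k_1 d + ik_1 \cos k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}.$$

Lastly,

$$\frac{k_0(1 - r)}{4k_1} + \frac{k_1(1 + r)}{4k_0} + \frac{1}{2} = \frac{1}{2} \left[\frac{2k_0 k_1 \sin k_1 d + i\left(k_0^2 + k_1^2\right) \cos k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} + 1\right] = \frac{i}{2} \frac{(k_0 + k_1)^2 e^{-ik_1 d}}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d},$$

where at the last step, I replaced $\sin k_1 d + i \cos k_1 d$ with $i \exp(-ik_1 d)$. Similarly,

$$\frac{k_0(1 - r)}{4k_1} + \frac{k_1(1 + r)}{4k_0} - \frac{1}{2} = \frac{1}{2} \left[\frac{2k_0 k_1 \sin k_1 d + i\left(k_0^2 + k_1^2\right) \cos k_1 d}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} - 1\right] = \frac{i}{2} \frac{(k_0 - k_1)^2 e^{ik_1 d}}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}.$$

Combining all these results, I finally get $A^{(R)}_2$:

$$A^{(R)}_2 = \frac{i}{2} \frac{(k_0 + k_1)^2 - (k_0 - k_1)^2}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d} = \frac{2ik_0 k_1}{\left(k_0^2 + k_1^2\right) \sin k_1 d + 2ik_0 k_1 \cos k_1 d}. \tag{12.49}$$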

Catching my breath after this marathon calculation (OK, half marathon), I am eager to compare Eq. 12.49 with Eq. 12.45 for the transmission amplitude. With a sigh of relief, I find that they, indeed, coincide. I will leave it as an exercise to demonstrate that $B^{(R)}_2$ vanishes as it should.

12.1.3.2 Bound States

Now I will illustrate application of the transfer-matrix approach to bound states of the square potential well described by the same Eq. 12.42. The discrete spectrum of this potential is expected to exist in the interval of energies defined as $V_w < E < V_b$. The transfer matrix given in Eq. 12.43 can be adapted to this case by replacing wave number $k_0$ with $i\kappa_0$, where $\kappa_0$ in this context is defined as

$$\kappa_0 = \frac{\sqrt{2m(V_b - E)}}{\hbar}.$$

This procedure yields

$$T = \frac{1}{2\kappa_0 k_1} \begin{bmatrix} \left(-\kappa_0^2 + k_1^2\right) \sin k_1 d + 2\kappa_0 k_1 \cos k_1 d & \left(k_1^2 + \kappa_0^2\right) \sin k_1 d \\ -\left(k_1^2 + \kappa_0^2\right) \sin k_1 d & -\left(-\kappa_0^2 + k_1^2\right) \sin k_1 d + 2\kappa_0 k_1 \cos k_1 d \end{bmatrix}$$

and Eq. 12.41 for the bound state energies takes the following form:

$$2\kappa_0 k_1 \cos k_1 d = \left(-\kappa_0^2 + k_1^2\right) \sin k_1 d$$

or

$$\tan(k_1 d) = \frac{2\kappa_0 k_1}{-\kappa_0^2 + k_1^2}. \tag{12.50}$$

At first glance, this result does not agree with the one I derived in Sect. 6.2.1, where states were segregated according to their parity with different equations for the energy levels of the even and odd states. Equation 12.50, on the other hand, is a single equation, and the parity of the states has never even been mentioned. If, however, you pause to think about it, you will see that the differences between results obtained here and in Sect. 6.2.1 are purely superficial.

First of all, you need to notice that the coordinates used here and in Sect. 6.2.1 have different origins. Placing the origin of the coordinate at the center of the well made the inversion symmetry of the potential with respect to its center reflected in its coordinate dependence. Consequently, we were able to classify states by their parity. This immediate benefit of the symmetry is lost once the origin of the coordinate is displaced from the center of the well. This, of course, did not change the underlying symmetry of the potential (it has nothing to do with such artificial things as our choice of the coordinate system), but it masked it. The wave functions written in the coordinate system centered at the edge of the potential well do not have a definite parity with respect to point $z = 0$, and it is not surprising that my derivation of the eigenvalue equation naturally yielded a single equation for all energy eigenvalues. However, it is not too difficult to demonstrate that our single Eq. 12.50 is in reality equivalent to the two equations of Sect. 6.2.1, but it does take some extra effort.

First, you will notice that trigonometric functions in Eqs. 6.39 and 6.42 are expressed in terms of $kd/2$, while Eq. 12.50 contains $\tan(k_1 d)$. Thus, it makes sense to try to express $\tan(k_1 d)$ in terms of $k_1 d/2$ using the well-known identity

$$\tan(k_1 d) = \frac{2\tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)},$$

which yields

$$\frac{\tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)} = \frac{\kappa_0 k_1}{-\kappa_0^2 + k_1^2}.$$

To simplify the algebra, it is useful to temporarily introduce notations $x = \tan(k_1 d/2)$, $\eta = \left(-\kappa_0^2 + k_1^2\right)/\kappa_0 k_1$, and rewrite the preceding equation as a quadratic equation for $x$:

$$x^2 + \eta x - 1 = 0.$$

This equation has two solutions:

$$x_{1,2} = -\frac{1}{2}\eta \pm \frac{1}{2}\sqrt{\eta^2 + 4}.$$

Computing $\eta^2 + 4$ you will easily find that

$$\eta^2 + 4 = \frac{k_1^4 + \kappa_0^4 - 2k_1^2 \kappa_0^2}{k_1^2 \kappa_0^2} + 4 = \frac{\left(\kappa_0^2 + k_1^2\right)^2}{k_1^2 \kappa_0^2},$$

which yields the following for $x_1$ and $x_2$:

$$x_1 = -\frac{-\kappa_0^2 + k_1^2}{2\kappa_0 k_1} - \frac{\kappa_0^2 + k_1^2}{2k_1 \kappa_0} = -\frac{k_1}{\kappa_0},$$

$$x_2 = -\frac{-\kappa_0^2 + k_1^2}{2\kappa_0 k_1} + \frac{\kappa_0^2 + k_1^2}{2k_1 \kappa_0} = \frac{\kappa_0}{k_1}.$$

Recalling what $x$ stands for, you can see that the single Eq. 12.50 is now replaced by two equations:

$$\tan(k_1 d/2) = -\frac{k_1}{\kappa_0}, \tag{12.51}$$

$$\tan(k_1 d/2) = \frac{\kappa_0}{k_1}, \tag{12.52}$$

which are exactly the eigenvalue equations for odd and even wave functions derived in Sect. 6.2.1. Isn’t it beautiful, really?
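The equivalence can also be checked numerically. The sketch below makes several assumptions for illustration: units $\hbar = m = 1$, energies counted from the bottom of the well ($V_w = 0$, $V_b = V_0$), sample values $V_0 = 10$ and $d = 2$, and hypothetical helper names. It looks for zeros of the combination $2\kappa_0 k_1 \cos k_1 d - (k_1^2 - \kappa_0^2)\sin k_1 d$, which is proportional to $t_{22}$, and then tests each root against Eqs. 12.51 and 12.52.

```python
import math

def t22_numerator(E, V0, d):
    """Proportional to t22(E) for the well; zero at a bound state (Eq. 12.41)."""
    k1 = math.sqrt(2.0 * E)
    kappa0 = math.sqrt(2.0 * (V0 - E))
    return (2.0 * kappa0 * k1 * math.cos(k1 * d)
            - (k1**2 - kappa0**2) * math.sin(k1 * d))

def bisect(f, a, b, tol=1e-13):
    """Plain bisection on an interval [a, b] where f changes sign."""
    fa = f(a)
    while b - a > tol:
        c = 0.5 * (a + b)
        fc = f(c)
        if fa * fc <= 0.0:
            b = c
        else:
            a, fa = c, fc
    return 0.5 * (a + b)

def bound_state_energies(V0, d, samples=4000):
    """Scan 0 < E < V0 for sign changes of t22 and refine each one."""
    f = lambda E: t22_numerator(E, V0, d)
    grid = [V0 * (i + 0.5) / samples for i in range(samples)]  # skips E = 0, V0
    return [bisect(f, a, b) for a, b in zip(grid, grid[1:]) if f(a) * f(b) < 0.0]

def classify(E, V0, d):
    """Return ('even' or 'odd', residual) by comparing with Eqs. 12.52 and 12.51."""
    k1 = math.sqrt(2.0 * E)
    kappa0 = math.sqrt(2.0 * (V0 - E))
    x = math.tan(0.5 * k1 * d)
    even_residual = abs(x - kappa0 / k1)
    odd_residual = abs(x + k1 / kappa0)
    if even_residual < odd_residual:
        return "even", even_residual
    return "odd", odd_residual

levels = bound_state_energies(V0=10.0, d=2.0)
for E in levels:
    print(E, classify(E, 10.0, 2.0))
```

Every root of the single equation satisfies one of the two parity equations to within the accuracy of the bisection, and the parities alternate starting from an even ground state.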

Having figured out the situation with the eigenvalues, I can take care of the eigenvectors. The ratio of the wave function amplitudes $A_2/B_0$ is given by

$$\frac{A_2}{B_0} = t_{12} = \frac{\left(k_1^2 + \kappa_0^2\right) \sin k_1 d}{2\kappa_0 k_1}, \tag{12.53}$$

while amplitudes of the wave functions in the region $0 < z < d$ are found from

$$\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = D^{(1,0)} \begin{bmatrix} 0 \\ B_0 \end{bmatrix}.$$

Matrix $D^{(1,0)}$ is adapted to the case under consideration by the same substitution $k_0 \to i\kappa_0$ as before:

$$D^{(1,0)} = \begin{bmatrix} \frac{k_1 + i\kappa_0}{2k_1} & \frac{k_1 - i\kappa_0}{2k_1} \\ \frac{k_1 - i\kappa_0}{2k_1} & \frac{k_1 + i\kappa_0}{2k_1} \end{bmatrix}.$$

Using this matrix, you easily find

$$A_1 = \frac{k_1 - i\kappa_0}{2k_1} B_0, \qquad B_1 = \frac{k_1 + i\kappa_0}{2k_1} B_0,$$

which yields the following expression for the wave function inside the well:

$$\psi(z) = B_0 \left[\frac{k_1 - i\kappa_0}{2k_1} \exp(ik_1 z) + \frac{k_1 + i\kappa_0}{2k_1} \exp(-ik_1 z)\right] = B_0 \left[\cos k_1 z + \frac{\kappa_0}{k_1} \sin k_1 z\right].$$

You can replace the ratio $\kappa_0/k_1$ in this expression with $\tan(k_1 d/2)$ or with $-\cot(k_1 d/2)$ according to Eqs. 12.51 and 12.52 and obtain the following expressions for the wave function representing two different types of states:

$$\psi(z) = \begin{cases} \dfrac{B_0}{\cos(k_1 d/2)} \cos\left[k_1(z - d/2)\right], & \kappa_0/k_1 = \tan(k_1 d/2) \\[2ex] -\dfrac{B_0}{\sin(k_1 d/2)} \sin\left[k_1(z - d/2)\right], & \kappa_0/k_1 = -\cot(k_1 d/2). \end{cases}$$

It is quite obvious now that the found wave functions are even and odd with respect to variable $\tilde{z} = z - d/2$, which is merely a coordinate defined in the coordinate system with the origin at the center of the well, just like in Sect. 6.2.1. One can also show that Eq. 12.53 is reduced to $A_2 = \pm B_0$ for two different types of the wave function, again in agreement with the results of Sect. 6.2.1. This proof I will leave to you as an exercise.


12.2 Resonant Tunneling

In this section I will apply the transfer-matrix method to describe an interesting and practically important phenomenon of resonant tunneling. This phenomenon arises when one considers quantum states of a particle in a potential that consists of two (or more) potential barriers separated by a potential well. An example of such a potential is shown in Fig. 12.3. I am interested here in the states corresponding to under-barrier values of energy $E$: $0 < E < V$. It was established in Sect. 6.2.1 that in the case of a single barrier whose width $d$ satisfies the inequality $\kappa d \gg 1$, where $\kappa = \sqrt{2m_e(V - E)}/\hbar$, such states are characterized by an exponentially small transmission probability $T \propto \exp(-\kappa d)$, which is responsible for the effect of quantum tunneling: a particle incident on the barrier has a non-zero probability to “tunnel” through it and continue its free propagation on the other side of the barrier. You might wonder if adding a second barrier will result in any new and interesting effects. Common sense based on “classical” probability theory suggests that in the presence of the second barrier, the total transmission probability will simply be a product of transmission coefficients for each of the barriers, $T \propto T_1 T_2 \propto \exp(-\kappa_1 d_1 - \kappa_2 d_2)$, further reducing the probability that the particle tunnels through the barriers. However, as often happens, the reality is more complex (and sometimes more intriguing) than our initial intuition. So, let’s see if our intuition leads us astray in this case.

To simplify the algebra, I will assume that both barriers have the same width $d$ and height $V$ and that they are separated by a region of zero potential of length $w$. This potential profile is characterized by four discontinuity points with coordinates

$$x_0 = 0; \quad x_1 = d; \quad x_2 = d + w; \quad x_3 = 2d + w. \tag{12.54}$$

Accordingly, the propagation of a particle through this potential is described by four interface matrices, $D^{(1,0)}$, $D^{(2,1)}$, $D^{(3,2)}$, and $D^{(4,3)}$, and three free propagation matrices $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$. Matrices $D^{(1,0)}$ and $D^{(2,1)}$ are obviously identical to matrices $D^{(3,2)}$ and $D^{(4,3)}$, correspondingly, and can be obtained from those appearing in the first line of Eq. 12.43 by replacing $k_0 \to k = \sqrt{2m_e E}/\hbar$ and $k_1 \to i\kappa = i\sqrt{2m_e(V - E)}/\hbar$:

$$D^{(1,0)} = D^{(3,2)} = \begin{bmatrix} \frac{i\kappa + k}{2i\kappa} & \frac{i\kappa - k}{2i\kappa} \\ \frac{i\kappa - k}{2i\kappa} & \frac{i\kappa + k}{2i\kappa} \end{bmatrix}; \tag{12.55}$$

$$D^{(2,1)} = D^{(4,3)} = \begin{bmatrix} \frac{k + i\kappa}{2k} & \frac{k - i\kappa}{2k} \\ \frac{k - i\kappa}{2k} & \frac{k + i\kappa}{2k} \end{bmatrix}. \tag{12.56}$$

Fig. 12.3 Double-barrier potential

For matrices $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$, I can write, using the general definition, Eq. 12.26, and the expressions for the corresponding coordinates given in Eq. 12.54:

$$M^{(1)} = M^{(3)} = \begin{bmatrix} \exp(-\kappa d) & 0 \\ 0 & \exp(\kappa d) \end{bmatrix}, \tag{12.57}$$

$$M^{(2)} = \begin{bmatrix} \exp(ikw) & 0 \\ 0 & \exp(-ikw) \end{bmatrix}. \tag{12.58}$$

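The total transfer matrix $T^{(4)}$ then becomes

$$T^{(4)} = D^{(4,3)} M^{(3)} D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} = D^{(2,1)} M^{(1)} D^{(1,0)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} \equiv T^{(2)} M^{(2)} T^{(2)}, \tag{12.59}$$

where $T^{(2)}$ is the transfer matrix describing the single barrier. I do not have to calculate this matrix from scratch. Instead, I can again replace $k_0$ with $k$ and $k_1$ with $i\kappa$ in Eq. 12.43:

$$T^{(2)} = \frac{1}{2kk_1} \begin{bmatrix} i\left(k^2 + k_1^2\right) \sin k_1 d + 2kk_1 \cos k_1 d & i\left(k_1^2 - k^2\right) \sin k_1 d \\ -i\left(k_1^2 - k^2\right) \sin k_1 d & -i\left(k^2 + k_1^2\right) \sin k_1 d + 2kk_1 \cos k_1 d \end{bmatrix}$$

$$\to \frac{1}{2ik\kappa} \begin{bmatrix} i\left(k^2 - \kappa^2\right) \sin(i\kappa d) + 2ik\kappa \cos(i\kappa d) & -i\left(\kappa^2 + k^2\right) \sin(i\kappa d) \\ i\left(\kappa^2 + k^2\right) \sin(i\kappa d) & -i\left(k^2 - \kappa^2\right) \sin(i\kappa d) + 2ik\kappa \cos(i\kappa d) \end{bmatrix}$$

$$= \frac{1}{2ik\kappa} \begin{bmatrix} -\left(k^2 - \kappa^2\right) \sinh(\kappa d) + 2ik\kappa \cosh(\kappa d) & \left(\kappa^2 + k^2\right) \sinh(\kappa d) \\ -\left(\kappa^2 + k^2\right) \sinh(\kappa d) & \left(k^2 - \kappa^2\right) \sinh(\kappa d) + 2ik\kappa \cosh(\kappa d) \end{bmatrix}. \tag{12.60}$$

At the last step of this derivation, I used identities connecting trigonometric and hyperbolic functions: $\sin(iz) = i \sinh z$ and $\cos(iz) = \cosh z$. The elements of this matrix determine amplitude reflection and transmission coefficients for a single-barrier potential, $r_1$ and $t_1$ correspondingly, as established by Eqs. 12.34 and 12.39:

$$t_1 = \frac{2ik\kappa}{\left(k^2 - \kappa^2\right) \sinh(\kappa d) + 2ik\kappa \cosh(\kappa d)}, \tag{12.61}$$

$$r_1 = \frac{\left(\kappa^2 + k^2\right) \sinh(\kappa d)}{\left(k^2 - \kappa^2\right) \sinh(\kappa d) + 2ik\kappa \cosh(\kappa d)}. \tag{12.62}$$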

Equations 12.61 and 12.62, obviously, can be derived from Eqs. 12.44 and 12.45 for the single-well problem with the same replacements of $k_0$ and $k_1$ used to obtain the T-matrix itself.

In order to simplify further computations and also to provide an easier way to relate the properties of the double-barrier structure to those of its single-barrier components, I am going to use Eqs. 12.34 and 12.39 to rewrite the transfer matrix in terms of the amplitude reflection and transmission coefficients, $r_1$ and $t_1$:

$$T^{(2)}_{22} = \frac{1}{t_1}; \qquad T^{(2)}_{21} = -\frac{r_1}{t_1}.$$

Using the explicit form of the matrix $T^{(2)}$, Eq. 12.60, you can determine that $T^{(2)}_{11} = \left(T^{(2)}_{22}\right)^*$ and $T^{(2)}_{12} = \left(T^{(2)}_{21}\right)^*$, so that the entire $T^{(2)}$ can be written down as

$$T^{(2)} = \begin{bmatrix} 1/t_1^* & -r_1^*/t_1^* \\ -r_1/t_1 & 1/t_1 \end{bmatrix}.$$

Multiplying this by M.2/ from Eq. 12.58, I get

T.2/M.2/ D

exp .ikw/ =t�1 � exp .�ikw/ r�1 =t�1 � exp .ikw/ r1=t1 exp .�ikw/ =t1

� 1=t�1 r�1 =t�1

�r1=t1 1=t1 � ;

and, finally, multiplying this matrix by T.2/ (from the left), I find the total double- barrier T-matrix T.4/:

T.4/ D 2 4

exp.ikw/

.t�1 / 2 C exp.�ikw/jr1j

2

jt1j2 exp.ikw/r�1

.t�1 / 2 � exp.�ikw/r

�

1

jt1j2

� exp.ikw/r1jt1j2 � exp.�ikw/r1

t21 � exp.ikw/jr1j2jt1j2 C

exp.�ikw/ t21

3 5

D 1jt1j2

2 4

t1 exp.ikw/ t�1

C jr1j2 exp .�ikw/ t1 exp.ikw/r �

1

t�1 � exp .�ikw/ r�1

�r1 exp .ikw/ � t �

1 exp.�ikw/r1 t1

� jr1j2 exp .ikw/C t �

1 exp.�ikw/ t1

3 5 :

This expression can be simplified by introducing

t1 D jtj exp .i't/ r1 D jrj exp .i'r/ ;

which yields

T.4/ D 1jt1j2 �

ei.kwC2't/ C jr1j2 e�ikw r�1 � ei.kwC2't/ � e�ikw�

�r1 � e�i.kwC2't/ C eikw� � jr1j2 eikw C e�i.kwC2't/

� D

414 12 Resonant Tunneling

1

jt1j2 �

2 4 e

i't h ei.kwC't/ C jr1j2 e�i.kwC't/

i 2ir�1 ei't sin .kw C 't/

�2r1e�i't cos .kw C 't/ e�i't h � jr1j2 ei.kwC't/ C e�i.kwC't/

i 3 5 :

(12.63)

At the last step I factored out exp .i't/ to make residual expressions more symmetrical with respect to the phases of the remaining exponential functions and used Euler’s identities cos x D .exp .ix/C exp .�ix//=2/ and sin x D .exp .ix/ � exp .�ix//=2i/. Now you can simply read out the expressions for the total amplitude reflection and transmission coefficients:

tdb D jt1j 2 exp .i't/

� jr1j2 exp .ikw C i't/C exp .�ikw � i't/ (12.64)

rdb D r � 1 exp .2i't/ Œexp .ikw C i't/ � exp .�ikw � i't/�

� jr1j2 exp .ikw C i't/C exp .�ikw � i't/ ; (12.65)

where subindex db stands for the double barrier. I will begin the analysis of the obtained expression with the transmission

probability Tdb D jtdbj2:

Tdb D jt1j 4

ˇ̌ ˇ � 1 � jr1j2

� cos .kw C 't/ � i

� 1C jr1j2

� sin .kw C 't/

ˇ̌ ˇ 2

At this point it is useful to recall that transmission and reflection probabilities obey the probability conservation condition jt1j2 C jr1j2 D 1, which allows to rewrite the expression for Tdb in the simplified form

Tdb D jt1j 4

jt1j4 cos2 .kw C 't/C � 1C jr1j2

�2 sin2 .kw C 't/

: (12.66)

The corresponding expression for the reflection probability becomes

Rdb D 4 jr1j 2 sin2 .kw C 't/

jt1j4 cos2 .kw C 't/C � 1C jr1j2

�2 sin2 .kw C 't/

(12.67)

12.2 Resonant Tunneling 415

Before going any further, it is always useful to check that the results obtained obey the probability conservation condition Rdb C Tdb D 1. To prove that this is indeed true, you just need to demonstrate that

jt1j4 C 4 jr1j2 sin2 .kw C 't/ D jt1j4 cos2 .kw C 't/C � 1C jr1j2

�2 sin2 .kw C 't/ :

You might probably find an easier way to prove this identity, but this is how I did it:

jt1j4 C 4 jr1j2 sin2 .kw C 't/ D jt1j4

� cos2 .kw C 't/C sin2 .kw C 't/

�C 4 jr1j2 sin2 .kw C 't/ D jt1j4 cos2 .kw C 't/C

� 4 jr1j2 C jt1j4

� sin2 .kw C 't/ D

jt1j4 cos2 .kw C 't/C 4 jr1j2 C

� 1 � jr1j2

�2 sin2 .kw C 't/ D

jt1j4 cos2 .kw C 't/C � 1C jr1j2

�2 sin2 .kw C 't/ : (12.68)

Having verified that my calculations are not obviously wrong, I can proceed with their analysis. If you remember that the naive expectation, which I described in the beginning of this section, was that adding a second barrier would result in a total transmission being just a product of the transmission probabilities through each barrier, which in our case of identical barriers would mean Tdb D jt1j4. Looking at Eq. 12.66, you can indeed notice the factor jt1j4 in its numerator, but you will also see that this factor is accompanied by a denominator, which is responsible for breaking our naive expectations. What this denominator does, it selects special energies, namely, the ones obeying the condition

& .E/ D k.E/w C 't.E/ D �n; n D 1; 2; 3 � � � ; (12.69)

which turns sin .kw C 't/ in Eqs. 12.66 and 12.67 to zero. For energies satisfying Eq. 12.69, the reflection coefficient vanishes and the transmission coefficient turns to unity. So much for the second barrier suppressing the transmission probability! In reality, the presence of the second barrier somehow magically helps the quantum particle to penetrate both barriers without any reflection, albeit only at special energies. This phenomenon is called resonant tunneling, and it is a wonderful manifestation of importance of quantum superposition of states or, as one could say, of the wave nature of quantum particles. Energy values at which the resonant tunneling takes place are called tunneling resonances.

To analyze this effect in more details, it is useful to rearrange terms in the denominator of Eq. 12.66. Using identity in Eq. 12.68, I can rewrite the expression for the transmission probability T2 as

416 12 Resonant Tunneling

Tdb D jt1j 4

jt1j4 C 4 jr1j2 sin2 .kw C 't/ D

1

1C 4jr1j2jt1j4 sin 2 .kw C 't/

: (12.70)

This expression makes it even more obvious that every time when the energy of the particle obeys the resonance condition, Eq. 12.69, the transmission turns to unity, but it also reveals the role of the parameter:

� D jt1j 2

jr1j : (12.71)

Indeed, let me find the values of the energy for which the transmission drops to the half of its maximum value, i.e., becomes equal to 1=2. Quite obviously, this happens whenever

4

�2 sin2 & D 1 ” jsin & j D �=2: (12.72)

In the case of thick individual barriers, when the effect of the resonant transmission is most drastic, the single-barrier transmission $|t_1|$ is small, while the reflection $|r_1|$ is almost unity. In this case Eq. 12.71 can be approximated as follows:

$$\gamma = \frac{|t_1|^2}{\sqrt{1 - |t_1|^2}} \approx \frac{|t_1|^2}{1 - |t_1|^2/2} \approx |t_1|^2\left(1 + |t_1|^2/2\right) \approx |t_1|^2, \tag{12.73}$$

where I neglected terms smaller than $|t_1|^2$. This approximation shows that $\gamma$ is as small as $|t_1|^2$, meaning that, according to Eq. 12.72, the value of the phase $\varsigma(E)$ at the energies corresponding to $T_{db} = 1/2$ deviates only weakly from its value $\varsigma(E_n) = \pi n$ at the resonant energy $E_n$. Accordingly, $\varsigma(E)$ can be presented as $\varsigma(E) = \pi n + \delta\varsigma_n$, where $|\delta\varsigma_n| \ll 1$, which allows me to simplify Eq. 12.72 as

$$|\sin(\delta\varsigma_n)| \approx |\delta\varsigma_n| = \gamma/2. \tag{12.74}$$

Thus, the parameter $\gamma/2$ determines the magnitude of the deviation of the phase $\varsigma(E)$ from its resonant value required to bring the transmission coefficient down by half. The smaller $\gamma$ is, the smaller this deviation, which means that a smaller $\gamma$ results in a steeper decrease of transmission when the particle's energy shifts away from the resonance. The deviation of the phase can be translated into the respective deviation of energy by writing

$$\varsigma(E) \approx \varsigma(E_n) + \frac{d\varsigma(E)}{dE}\,\delta E.$$


Fig. 12.4 Double-barrier transmission of an electron as a function of energy for the structure with barrier height 1 eV, the distance between the barriers $w = 1.2$ nm, and three barrier widths: blue line corresponds to $d = 0.8$ nm, red to $d = 0.4$ nm, and black to $d = 0.2$ nm. Energy is given in dimensionless units of $2 m_e E w^2/\hbar^2$

The deviation of the phase equal to $\gamma/2$ corresponds to the deviation of energy equal to

$$\frac{\gamma_E}{2} = \left(\frac{d\varsigma(E)}{dE}\right)^{-1}\frac{\gamma}{2}. \tag{12.75}$$
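To get a feeling for how good the thick-barrier approximations of Eqs. 12.73 and 12.74 are, here is a small Python sketch. The function names, the sample values of $|t_1|^2$, and the value used for the phase derivative are hypothetical choices made only for illustration; the derivative itself would have to come from the single-barrier problem.

```python
import math

def gamma_exact(t1_sq):
    """Eq. 12.71 with |r1| = sqrt(1 - |t1|^2)."""
    return t1_sq / math.sqrt(1.0 - t1_sq)

def half_max_phase_shift(t1_sq):
    """Exact phase deviation solving Eq. 12.72, |sin(shift)| = gamma/2."""
    return math.asin(gamma_exact(t1_sq) / 2.0)

def energy_fwhm(t1_sq, dphase_dE):
    """Full width at half maximum from Eq. 12.75; dphase_dE is an assumed input."""
    return gamma_exact(t1_sq) / dphase_dE

for t1_sq in (0.2, 0.05, 0.001):
    g = gamma_exact(t1_sq)
    err_gamma = g / t1_sq - 1.0                               # error of Eq. 12.73
    err_shift = half_max_phase_shift(t1_sq) / (g / 2) - 1.0   # error of Eq. 12.74
    print(f"|t1|^2={t1_sq:6.3f}  gamma={g:.6f}  "
          f"rel. error of 12.73: {err_gamma:.2e}  of 12.74: {err_shift:.2e}")

# Hypothetical derivative of the phase, in inverse energy units.
print("width for |t1|^2 = 0.05:", energy_fwhm(0.05, dphase_dE=10.0))
```

Both relative errors shrink as $|t_1|^2$ decreases, which is the sense in which the approximations hold for thick barriers.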

If one plots transmission as a function of particle energy, the resonant values will appear as peaks of the transmission, while the parameter $\gamma_E$ will determine the width of these peaks. More accurately, $\gamma_E/2$ is called the half-width at half-maximum (HWHM). The origin of "half-maximum" in this term is obvious, and "half-width" refers to the fact that Eq. 12.74 has two solutions, $\pm\gamma/2$, so that the total width of the resonance at half-maximum is $(E_n + \gamma_E/2) - (E_n - \gamma_E/2) = \gamma_E$. Widening of the barriers results in decreasing $\gamma$, which can be qualitatively described as narrowing of the resonances. You can observe this phenomenon in Fig. 12.4, presenting transmission as a function of energy for several barrier widths. You can also see that the resonances broaden with increasing energy. This is the result of the energy dependence of the elements of the single-barrier transfer matrix and, correspondingly, of the parameter $\gamma$ and the derivative of the phase $d\varsigma/dE$. The explicit expression for this derivative can be found from Eq. 12.61 for the amplitude transmission coefficient, but the result is rather cumbersome and can be left out.

This figure reveals that the parameter $\gamma$ also determines how small the transmission becomes between the maxima and, therefore, how prominent the resonances are. In order to see where this effect comes from, it is useful to rewrite Eq. 12.70 for transmission as

$$T_{db} = \frac{(\gamma/2)^2}{(\gamma/2)^2 + \sin^2(kw + \varphi_t)}. \tag{12.76}$$

One can see now that the minimum of transmission, which occurs whenever $|\sin(kw + \varphi_t)|$ reaches its largest value of unity, is

$$T^{(\min)}_{db} = \frac{(\gamma/2)^2}{(\gamma/2)^2 + 1} = \frac{\gamma^2}{\gamma^2 + 4} \approx \frac{\gamma^2}{4},$$

where I assumed at the last step that $\gamma \ll 1$; i.e., the minimum transmission increases with increasing $\gamma$. You may also notice that the position of the resonances is different for different barrier thicknesses. This result seems to be contrary to Eq. 12.69, which shows explicitly only the dependence of the resonant energies on the distance between the barriers, $w$. The observed dependence of the resonances on $d$ emphasizes the role of the phase factor $\varphi_t$, which does depend on the thickness of the barriers, but is often overlooked.
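The narrowing of the resonances with barrier width and the shift of their positions can be reproduced numerically. The sketch below does not use the transfer matrices of this chapter; instead, as a swapped-in technique, it steps the wave function and its derivative backward through the structure, starting from a purely transmitted wave. It assumes rectangular barriers, the electron mass with $\hbar^2/2m_e \approx 0.0381$ eV nm$^2$, and the parameters quoted in the caption of Fig. 12.4; all function names are hypothetical.

```python
import numpy as np

HBAR2_OVER_2M = 0.0380998  # hbar^2 / (2 m_e) in eV * nm^2

def transmission_rectangular(E, V0=1.0, d=0.4, w=1.2):
    """Transmission through two identical rectangular barriers (assumed shape).

    E in eV (E must differ from V0), barrier height V0 in eV,
    barrier width d and barrier separation w in nm.
    """
    E = np.asarray(E, dtype=float)
    k = np.sqrt(E / HBAR2_OVER_2M)              # wave number outside the barriers
    q = np.sqrt((E - V0) / HBAR2_OVER_2M + 0j)  # imaginary inside a barrier if E < V0
    # Behind the structure only the transmitted wave exp(ikz) exists (amplitude 1).
    psi = np.ones_like(E, dtype=complex)
    dpsi = 1j * k * psi
    # Step (psi, psi') backward through barrier, well, barrier.
    for kappa, length in ((q, d), (k, w), (q, d)):
        c, s = np.cos(kappa * length), np.sin(kappa * length)
        psi, dpsi = c * psi - s / kappa * dpsi, kappa * s * psi + c * dpsi
    incident = 0.5 * (psi + dpsi / (1j * k))    # amplitude of exp(ikz) in front
    return 1.0 / np.abs(incident) ** 2

def peak_and_fwhm(E, T):
    """Position, height, and full width at half maximum of the highest peak."""
    i = int(np.argmax(T))
    lo = hi = i
    while lo > 0 and T[lo] > 0.5 * T[i]:
        lo -= 1
    while hi < len(T) - 1 and T[hi] > 0.5 * T[i]:
        hi += 1
    return E[i], T[i], E[hi] - E[lo]

if __name__ == "__main__":
    energies = np.linspace(0.02, 0.4, 38001)    # eV, window around the first resonance
    for d in (0.2, 0.4, 0.8):
        E_res, T_peak, width = peak_and_fwhm(energies, transmission_rectangular(energies, d=d))
        print(f"d = {d} nm: E_res = {E_res:.4f} eV, T_peak = {T_peak:.4f}, FWHM = {width:.2e} eV")
```

The energy window is chosen by hand to isolate the lowest resonance, and the grid must be finer than the narrowest peak for the reported width to be meaningful; both are choices of this sketch rather than anything prescribed by the text.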

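In the vicinity of the resonance $\varsigma_n = \pi n$, one can expand $\sin(\varsigma)$ as

$$\sin(\varsigma - \varsigma_n + \pi n) = (-1)^n \sin(\varsigma - \varsigma_n) \approx (-1)^n (\varsigma - \varsigma_n) \approx (-1)^n \frac{d\varsigma}{dE}\,(E - E_n).$$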

With this approximation Eq. 12.76 for transmission can be presented in the vicinity of the resonance as

$$T_{db} = \frac{(\gamma_E/2)^2}{(\gamma_E/2)^2 + (E - E_n)^2}. \tag{12.77}$$

Resonance behavior of this type occurs frequently in various areas of physics and is called a Breit–Wigner resonance, while Eq. 12.77 bears the name of a Breit–Wigner formula.¹
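How far from the resonance the Breit–Wigner form can be trusted is easy to check numerically. The sketch below compares Eq. 12.76 with its Lorentzian approximation, written in terms of the phase rather than the energy so that no value of the phase derivative is needed (dividing both the phase deviation and the width parameter by that derivative gives Eq. 12.77). The function names and the value of the width parameter are illustrative assumptions.

```python
import math

def t_db_exact(gamma, phase):
    """Eq. 12.76 as a function of the phase k*w + phi_t."""
    return (gamma / 2) ** 2 / ((gamma / 2) ** 2 + math.sin(phase) ** 2)

def t_db_lorentzian(gamma, phase, n=1):
    """Breit-Wigner form in the phase variable: sin(phase) -> phase - pi*n."""
    return (gamma / 2) ** 2 / ((gamma / 2) ** 2 + (phase - math.pi * n) ** 2)

gamma = 0.05  # hypothetical width parameter, gamma << 1
for shift in (0.0, gamma / 2, 0.1, 0.5, 1.5):
    phase = math.pi + shift
    print(f"shift = {shift:5.3f}  exact = {t_db_exact(gamma, phase):.6f}  "
          f"Lorentzian = {t_db_lorentzian(gamma, phase):.6f}")
```

Close to the peak the two expressions are practically indistinguishable, while far from it the Lorentzian keeps decaying and the exact expression levels off at its minimum value.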

The treatment of the resonant tunneling, which I have developed, is remarkably independent of the details of the shapes of the barriers constituting the double-barrier structure. As long as the boundaries of the barriers are clearly defined so that I can write down a single-barrier transfer matrix $T^{(2)}$ and the distance between the barriers, $w$, I can use the results of this section. Do not get me wrong: the parameters of $T^{(2)}$, of course, depend on the details of the barrier's shape, but what I want to say is that $T^{(2)}$ can be computed independently of the double-barrier problem once and for all, numerically if needed, and then used in the analysis of the resonant tunneling.

¹ Gregory Breit was an American physicist, known for his work in high energy physics and involvement at the earlier stages of the Manhattan project. Eugene Wigner was a Hungarian-American theoretical physicist, winner of half of the 1963 Nobel Prize "for his contributions to the theory of the atomic nucleus and the elementary particles, particularly through the discovery and application of fundamental symmetry principles." In 1939 he participated in the fateful Einstein–Szilard meeting resulting in a letter to President Roosevelt prompting him to initiate work on development of atomic bombs. You might find this comment of his particularly intriguing, "It was not possible to formulate the laws of quantum mechanics in a fully consistent way without reference to consciousness," which he made in one of his essays published in the collection "Symmetries and Reflections – Scientific Essays (1995)."

So, I hope you are convinced by now that the resonant tunneling is a remarkable phenomenon, which can be relatively simply described in terms of the reflection and transmission coefficients of a single barrier. Still, you might feel certain dissatisfaction because all these calculations do not really explain how passing through two thick barriers instead of one can improve the probability of transmission, let alone make it equal to one. They also do not clarify the role of the quantum superposition, which, I claimed, played a crucial role in this phenomenon. There are several distinct ways to develop a more qualitative, intuitive understanding of this situation. The first is naturally based on thinking about quantum mechanical properties of the particle in terms of waves, their superposition and interference. To see how these ideas play out, consider an expression for the amplitude reflection coefficient $r_{db}$, which determines the relative contribution of the backward propagating wave in the wave function representing the state of the particle in the region $z < 0$:

$$\psi(z) = \exp(ikz) + r_{db}\exp(-ikz).$$

A careful look at Eq. 12.65 reveals that this expression describes a superposition of two waves, both propagating backward, but with different phases. The origin of these contributions is the multiple reflections of the waves representing the particle's state between the boundaries of both barriers (this is why the second barrier is crucial for this effect to occur). The only terms contributing to the phase difference between them are $\exp(ikw + i\varphi_t)$ and $\exp(-ikw - i\varphi_t + i\pi)$, where the extra $i\pi$ in the argument of the exponent takes care of the negative sign appearing in front of this expression in Eq. 12.65. The phase difference between these contributions to the reflected (backward propagating) component of the wave function is $\Delta = 2kw + 2\varphi_t + \pi$, and if we want to suppress reflection by destructive interference, we must require that $\Delta = \pi + 2\pi n$, which results in exactly the condition for the transmission resonance, $kw + \varphi_t = \pi n$.
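The destructive interference described above can be seen in a few lines of Python. The sketch adds only the two phase factors quoted in the text; the common prefactors of Eq. 12.65 are deliberately left out, and the function name is a hypothetical choice.

```python
import cmath
import math

def backward_wave_sum(phase):
    """Sum of the two phase factors exp(i*phase) + exp(-i*phase + i*pi).

    phase stands for k*w + phi_t; analytically the sum equals 2i*sin(phase).
    """
    return cmath.exp(1j * phase) + cmath.exp(-1j * phase + 1j * math.pi)

for label, phase in (("n = 1 resonance", math.pi),
                     ("n = 2 resonance", 2 * math.pi),
                     ("between resonances", 1.5 * math.pi)):
    print(f"{label:20s} |sum| = {abs(backward_wave_sum(phase)):.3e}")
```

At the resonant phases the two contributions cancel up to rounding error, while halfway between the resonances they add up to the largest possible magnitude.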

It is also instructive to take a look at the spatial dependence of the probability density $|\psi(z)|^2$ for resonant and off-resonant values of energy. The analytical expression for this quantity is quite cumbersome, especially off the resonance, so I will spare you from having to suffer through its derivation, presenting instead only the corresponding graphs obtained for the same values of the parameters as in Fig. 12.4 for off- and on-resonance values of the particle's energy. The first two graphs in Fig. 12.5 correspond to values of energy smaller and larger than the energy of the first tunneling resonance. In both cases you can observe oscillations of the probability in the region $z < 0$ due to interference between incident and reflected waves. You should also notice that, for energies both below and above the resonance, the relative probability to find the particle in front of the barrier at the maxima of the interference pattern significantly exceeds the probability to find the particle between the barriers or behind them (the right boundary of the second barrier can be clearly identified from the graphs by the absence of any interference pattern in the transmitted wave).

The situation, however, changes completely at the resonance (the last graph in the figure). The most remarkable feature of this graph is a pronounced increase of the likelihood that the particle is located between the barriers. If we are dealing with a beam of many electrons incident on the structure, this effect will result in an accumulation of electrons between the barriers, making this region strongly negatively charged. The electric field associated with this strong charge will repel incoming electrons, making it more difficult for additional electrons to penetrate the barriers. This effect, called Coulomb blockade, can be noticed as an increase in the number of reflected electrons as we increase the density of electrons in the beam. For a very small distance between the barriers, the effect of Coulomb blockade can be so strong that even a single electron is capable of preventing other electrons from entering the structure. Thanks to this phenomenon, physicists and engineers gain the ability to count individual electrons and develop single-electron devices.

Fig. 12.5 Spatial dependence of the probability density $|\psi(z)|^2$ for energies below, above, and equal to the energy of the first tunneling resonance. Parameters of the double-barrier structure are the same as in Fig. 12.4 with the barrier width $d = 0.4$ nm

It is important to notice that the resonance probability distribution featured in Fig. 12.5 corresponds to the smallest of the resonance energies, which satisfies Eq. 12.69 with $n = 1$. Now I want you to take a look at the probability distributions corresponding to resonance energies satisfying Eq. 12.69 with $n = 2$ and $n = 3$ presented in Fig. 12.6.

Fig. 12.6 Spatial dependence of the probability density $|\psi(z)|^2$ at the resonance energies of the second and third order ($n = 2, 3$)

Ignore for a moment that the functions depicted in Figs. 12.5 and 12.6 do not vanish at infinity, and compare them to those shown in Fig. 6.6, which present the wave functions corresponding to the first three bound energy levels in a rectangular potential. Taking into consideration the obvious difference stemming from the fact that the graphs in Fig. 6.6 are those of the real-valued wave functions, while Figs. 12.5 and 12.6 depict $|\psi(z)|^2$, you cannot help noticing the eerie resemblance between the two sets of graphs. You might also notice that the resonance condition, Eq. 12.69, resembles an equation for the energy eigenvalues of the bound states. Actually, in the limit $d \to \infty$, this equation must exactly reproduce Eq. 12.50 with obvious replacements $d \to w$ and $V_w \to 0$, and it would be interesting to demonstrate that. Will you dare to try? Do not get deceived by the term $kw$ in Eq. 12.69, which might make you think about the bound states of an infinite potential well. The finite height of the potential barriers arising in the limit $d \to \infty$ is hidden in the phase term $\varphi_t$, which plays the main role when recasting Eq. 12.69 into the form of Eq. 12.50.

Anyway, this similarity between resonance wave functions and those of the bound states offers an alternative interpretation of the phenomenon of the resonant tunneling. Imagine that you start with a potential in which the barriers are infinitely thick, so that you can place a particle in one of the stationary states of the respective potential well. Then, using a magic wand, you reduce the thickness of the barriers to a finite value. What will happen to the particle in this situation? Using what we learned about the tunneling effect, you can intuit that the particle will "tunnel out" of the potential well and escape to infinity. In formal mathematical language, this can be rephrased by saying that the boundary condition for the corresponding Schrödinger equation at $z \to \pm\infty$ must take the form of a wave propagating to the right, $\exp(ikz)$, for $z \to \infty$ and a wave propagating to the left, $\exp(-ikz)$, for $z \to -\infty$. These boundary conditions differ from the ones we used when deriving the transmission and reflection coefficients by the absence of the wave $\exp(ikz)$ incident on the potential from negative infinity. Correspondingly, the wave function at $z < 0$ and $z > 2d + w$ is now presented by the column vectors

$$v_0 = \begin{pmatrix} 0 \\ r \end{pmatrix}; \qquad v_4 = \begin{pmatrix} t \\ 0 \end{pmatrix}$$

similar to the bound state problem. Also, similar to the bound state problem, you will have to conclude that the transfer-matrix equation $T^{(4)} v_0 = v_4$ in this case has non-zero solutions o