
2024 | Book

Artificial Intelligence

What Is Behind the Technology of the Future?


About this book

Artificial Intelligence (AI) is already present in our daily routines, and in the future, we will encounter it in almost every aspect of life – from analyzing X-rays for medical diagnosis, driving autonomous cars, and maintaining complex machinery, to drafting essays on environmental problems and drawing imaginative pictures. The potential of AI is enormous, but many myths, uncertainties, and challenges surround it and need to be addressed.
This book is the English translation of “Künstliche Intelligenz – Was steckt hinter der Technologie der Zukunft?”, originally published in German (Springer Vieweg, 2020). It is addressed to the general public, from interested citizens to corporate executives, who want to develop a better and deeper understanding of AI technologies and assess their consequences.
Mathematical basics, terminology, and methods are explained in understandable language. Adaptations to different media such as images, text, and speech and the corresponding generative models are introduced. A concluding discussion of opportunities and challenges helps readers evaluate new developments, demystify them, and assess their relevance for the future.

Table of Contents

Frontmatter
Chapter 1. What Is Intelligent About Artificial Intelligence?
Abstract
Recently, the term Artificial Intelligence (AI) has come into the focus of public discussion. An Artificial Intelligence system is supposed to be able to perceive its environment and behave intelligently, similar to humans. However, this definition is imprecise because the term “intelligence” is difficult to delineate. Therefore, this chapter discusses the individual dimensions of AI. Most AI systems are tasked with associating an input (e.g., an image) with an output (e.g., a class of image objects). Inputs and outputs are represented by sets of numbers. This mapping is not manually programmed, but successively adapted and trained based on observations and data. This process is also called “learning”.
Gerhard Paaß, Dirk Hecker
Chapter 2. What Are the Capabilities of Artificial Intelligence?
Abstract
In recent years, advances in computer processing power and the availability of suitable programming environments and algorithms have made it possible to solve some Artificial Intelligence subtasks in a satisfactory manner. This chapter provides an informal overview of the state of the art. Of particular importance here is the interpretation of sensor data, such as recognizing objects in photos, diagnosing diseases from images, or transcribing spoken language into text. There has also been progress in analyzing the meaning of language, e.g., machine translation from one language to another, answering questions by an AI system, or conducting meaningful dialogues by intelligent assistants. Finally, AI systems have been able to beat human experts at computer games, automatically drive vehicles in real-world traffic, or perform creative acts, such as inventing new stories. The techniques used will be explained in later chapters.
Gerhard Paaß, Dirk Hecker
Chapter 3. Some Basic Concepts of Machine Learning
Abstract
Machine Learning is tasked with extracting relationships from data that are relevant to the user. In this chapter, we formulate a simple linear model, the logistic regression model, which predicts the corresponding output for given inputs. The goal is to automatically find the relations between the existing input values and the output category in the data. For this purpose, a large number of numerical parameter values are modified step by step using a simple optimization procedure in such a way that the predicted outputs successively approach the correct outputs. The chapter describes the necessary procedure in full detail, but with a minimum of mathematical formulas. The more complex models of the following chapters are built according to exactly the same scheme and use the linear model examined here as a universal building block.
Gerhard Paaß, Dirk Hecker
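
The step-by-step parameter adjustment described in the Chapter 3 abstract can be illustrated with a small sketch. It is not taken from the book: it assumes plain batch gradient descent on the cross-entropy loss, and the function names, the toy OR dataset, the learning rate and the step count are all hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Predicted probability that input x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic_regression(inputs, labels, learning_rate=0.5, steps=2000):
    """Fit weights and bias by batch gradient descent (illustrative settings)."""
    n_features = len(inputs[0])
    n_samples = len(inputs)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(inputs, labels):
            error = predict(weights, bias, x) - y  # gradient of the loss w.r.t. z
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Move every parameter a small step against its averaged gradient.
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

# Hypothetical toy data: the logical OR function, which is linearly separable.
inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0, 1, 1, 1]
weights, bias = train_logistic_regression(inputs, labels)
predictions = [predict(weights, bias, x) for x in inputs]
```

After training, thresholding `predictions` at 0.5 should reproduce `labels`; the predicted probabilities approach the correct outputs as the number of steps grows.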
Chapter 4. Deep Learning Can Recognize Complex Relationships
Abstract
For more complex problems, simple linear models are insufficient. A way out is offered by models with several nonlinear layers (operators), which can represent arbitrary “curved” relationships between inputs and outputs. This chapter describes the properties of such deep neural networks and shows how to find the optimal parameters using the backpropagation method. It then discusses the problem of overfitting and how it can be solved using regularization methods. Finally, an overview of the different types of deep neural networks is given and methods for finding a network structure are outlined.
Gerhard Paaß, Dirk Hecker
Chapter 5. Image Recognition with Deep Neural Networks
Abstract
Image recognition is about finding automatic methods to identify objects and their arrangement in an image or photo. This includes classifying the image objects and determining their position in the image. The majority of DNNs for image processing are Convolutional Neural Networks (CNN). They use layers with small receptive fields (convolutions), which are shifted over the pixel matrix of the input image. They are capable of detecting local image features. In addition, pooling layers are used to aggregate features locally. Modern CNNs contain hundreds of these layers, which can successively recognize more complex image features. Some of them make fewer image classification errors than humans. Special variants have been developed to determine the position of objects in images with pixel precision. Finally, models for estimating the inaccuracy of image classifications are presented, and the influence of image distortions and intentional image manipulation on classification accuracy is discussed.
Gerhard Paaß, Dirk Hecker
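
The convolution and pooling layers mentioned in the Chapter 5 abstract can be sketched in a few lines. This is an illustrative assumption rather than the book's code: it uses a single hand-written 2x2 kernel instead of learned filters, no padding, a stride of 1 for the convolution, and non-overlapping 2x2 max pooling. As is common in CNN libraries, the "convolution" is computed as a cross-correlation (the kernel is not flipped).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

features = convolve2d(image, edge_kernel)  # 4x4 feature map
pooled = max_pool2d(features)              # 2x2 summary of the feature map
```

The feature map is nonzero only where the kernel straddles the vertical edge, which is the sense in which such a layer detects a local image feature; pooling then aggregates that response over a neighborhood.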
Chapter 6. Capturing the Meaning of Written Text
Abstract
The vast majority of information in our society is available as written text. Therefore, this chapter describes the extraction of knowledge from written text. In deep neural networks (DNN), words, sentences and documents are usually represented by embedding vectors. While simple embedding creation methods can only be used to approximate the meaning of words, recurrent neural networks (RNN) have the potential to capture the meaning of a sentence. The best known RNN, Long Short-Term Memory (LSTM), can be used as a language model. It predicts the next word in a sentence and can thus acquire the syntactic and semantic structure of a text. Among other things, it can be used to translate from one language to another. The BERT model calculates the “correlation” between the embeddings of all words of a text and derives context-sensitive embedding vectors that grasp much finer nuances of meaning. It is pre-trained on a large text dataset in an unsupervised manner and then adapted to specific tasks on a small labeled dataset. These models have now been shown to nearly match or exceed human performance for a wide variety of semantic tasks. The transformer model extends this approach to translation and generation of texts and other sequences. Further sections are devoted to the description of images by text and the explanation of predictions of deep neural networks.
Gerhard Paaß, Dirk Hecker
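
The “correlation” between word embeddings that the Chapter 6 abstract attributes to BERT can be illustrated with a heavily simplified sketch of self-attention. It assumes scaled dot-product attention and, unlike the actual model, omits the learned query, key and value projections, multiple heads and stacked layers. The function names and the two-dimensional toy embeddings are hypothetical.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Return one context-sensitive vector per input embedding."""
    dim = len(embeddings[0])
    contextual = []
    for query in embeddings:
        # Similarity of this word to every word in the text (scaled dot product).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in embeddings]
        weights = softmax(scores)
        # New embedding: weighted average of all embeddings in the text.
        mixed = [sum(w * value[d] for w, value in zip(weights, embeddings))
                 for d in range(dim)]
        contextual.append(mixed)
    return contextual

# Hypothetical embeddings for a three-word text.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contextual = self_attention(embeddings)
```

Each output vector is a mixture of all input vectors, weighted by similarity, so the representation of a word depends on the words around it.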
Chapter 7. Understanding Spoken Language
Abstract
This chapter describes models for speech recognition, i.e. for transferring spoken language into text. Speech recognizers use derived sound features for small time intervals as input. For speech processing, deep sequence-to-sequence models based on LSTM or transformers are used, which generate the recognized text. Alternatively, Convolutional Neural Networks are employed. A hybrid model of Sequence-to-Sequence and CNN models is able to achieve a lower recognition error than humans. When generating speech from text, WaveNet’s dilated CNN layers can reproduce the acoustic speech of a speaker extremely faithfully. Voice assistants, such as Siri and Alexa, allow users to engage in a dialogue. An example system is used to illustrate the construction of a variant of the Alexa voice assistant from subnetworks and other components. The classification of the events in a video is possible with variants of spatio-temporal convolutional layers. More difficult is the description of videos by subtitles, which can be done for example with the help of transformer translation models. In a last section, the influence of noise on speech recognition and the potential danger of adversarial attacks are discussed.
Gerhard Paaß, Dirk Hecker
Chapter 8. Learning Optimal Policies
Abstract
Reinforcement learning is an area of Machine Learning in which a software program (agent) must select an action at each time step with the goal of achieving the highest possible sum of rewards over time. An action is determined based on the current state and affects the reward, often many time steps later. Example applications include games, robotic controls, and self-driving cars. Deep neural networks (DNNs) can be used to assign a sum of expected rewards to a state-action pair. DNNs are particularly suitable because they can approximate the underlying functions well and may be employed to select the best action for a state. Q-Networks predict an expected reward sum for each state-action pair. A stochastic policy is suitable for decision situations with random influences and determines for each state an optimal probability distribution over the possible actions. For both types of models, training procedures are derived which determine the gradient from a number of simulated model runs. In contrast to the DNNs of previous chapters, the training data is generated during training using a simulated or real environment. Finally, application areas of reinforcement learning are described, such as video games, robot control, and autonomous vehicles.
Gerhard Paaß, Dirk Hecker
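
The idea of assigning an expected reward sum to each state-action pair, as described in the Chapter 8 abstract, can be sketched with tabular Q-learning. This is a deliberate simplification: a lookup table stands in for the deep Q-Network, and the five-state corridor environment, the hyperparameters and all names are hypothetical. As in the abstract, the training data is generated during training by interacting with a simulated environment.

```python
import random

N_STATES = 5       # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (0, 1)   # 0 = step left, 1 = step right

def step(state, action):
    """Simulated environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of expected discounted reward sums per state-action pair."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Greedy policy: the best-valued action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this corridor the learned greedy policy should step right in every state, since only reaching the goal yields a reward. A Q-Network replaces `q_table` with a neural network so that the approach scales to state spaces too large to enumerate.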
Chapter 9. Creative Artificial Intelligence and Emotions
Abstract
This chapter shows that deep neural networks (DNNs) can creatively generate novel images, text, music, and dialogue. In the case of images, generative adversarial networks (GANs) are able to create images with specific properties or style features. In addition, they can convert images from one type to another, such as a photo to a painting. For authoring texts, there are language models that can invent new complex stories and formulate them in fluent language. Music DNNs are trained with the notes of musical pieces and can “compose” new pieces of music that experts say achieve good quality. At the end of the chapter, intelligent speech assistants are discussed, which are able to recognize the emotional state of the conversation partner in his or her dialog contributions. They can respond appropriately, providing creative and focused answers so that the counterpart feels understood and is motivated to continue the conversation. In Asia, there are chatbots of this kind with hundreds of millions of users.
Gerhard Paaß, Dirk Hecker
Chapter 10. AI and Its Opportunities, Challenges and Risks
Abstract
Artificial intelligence has established itself as a central trend topic in the global technology industry in recent years. It is realized through deep neural networks and offers a wide range of opportunities and innovation potential, for example in the smart home, in medicine and in industrial applications. AI has a huge impact on economic development and our working world and poses major challenges to society. Internet corporations are using AI in a variety of ways and, by providing platforms, have now built up monopoly-like structures that concentrate large areas of value creation in the US. Profound changes in the labor market are also to be expected, which can only be mitigated by increased education and training efforts. AI systems potentially allow detailed surveillance of large segments of the population, and elaborate legal and organizational regulations are needed to guarantee citizens’ liberties. Researchers and policymakers have therefore developed a testing strategy for an “AI seal of approval”, which should guarantee that AI systems deliver the desired results in a verifiable manner. In addition, the systems should not disadvantage any population group, function robustly, respect privacy and be secure against attacks or accidents.
Gerhard Paaß, Dirk Hecker
Backmatter
Metadata
Title
Artificial Intelligence
Written by
Gerhard Paaß
Dirk Hecker
Copyright year
2024
Electronic ISBN
978-3-031-50605-5
Print ISBN
978-3-031-50604-8
DOI
https://doi.org/10.1007/978-3-031-50605-5
