Tuesday, March 14, 2023

How Appium tests Android apps step by step

As an Android mobile app developer learning the Appium test framework, I was initially confused by all the talk of a "web server" and "WebDriver". I used ChatGPT to help me understand the role of the web server in the test execution process.

Appium Test Code Example

Suppose we have an Android app called MyTaskList and we want an Appium script to tap the "Add Task" button once MyTaskList has launched:

import io.appium.java_client.MobileElement;
import io.appium.java_client.android.AndroidDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.remote.DesiredCapabilities;
import java.net.URL;

public class MyTaskListTest {
    public static void main(String[] args) throws Exception {
        // Set up desired capabilities
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("deviceName", "Android Emulator");
        caps.setCapability("platformName", "Android");
        caps.setCapability("appPackage", "com.example.mytasklist");
        caps.setCapability("appActivity", ".MainActivity");

        // Create a new instance of the AndroidDriver
        AndroidDriver<MobileElement> driver = new AndroidDriver<MobileElement>(
            new URL("http://127.0.0.1:4723/wd/hub"), caps);

        // Find and click on the "Add Task" button
        MobileElement addButton = driver.findElement(By.id("com.example.mytasklist:id/add_task_button"));
        addButton.click();

        // Close the driver session
        driver.quit();
    }
}

Architecture

What is the "architecture" of the hardware and software components that come together to perform this test?

When using Appium for Android mobile development, there are two main components involved: the developer's computer and the Android device (either a physical device connected via USB or an emulator).

The developer's computer runs the Appium server, which listens for incoming connections from the test script and communicates with the Android device over the Android Debug Bridge (ADB). The test script is executed on the developer's computer and uses the Appium client library to automate interactions with the Android device, such as launching the app, tapping on UI elements, and verifying app behavior.

The Android device, whether a physical device or an emulator, is used to execute the app and receive commands from the Appium server. The Appium client library sends commands to the Appium server, which in turn forwards them to the Android device over ADB. The device executes the commands and sends responses back to the Appium server, which then forwards them to the client library.

So in summary, Appium for Android mobile development involves a physical computer (the developer's computer) and an Android device (either physical or emulator) that communicate with each other over ADB, with the Appium server acting as a mediator between them.

Execution Steps

Explain the sequence of steps to execute this code, carefully explaining where each piece of software runs (i.e. which device):
  1. The developer writes an Appium test script using a programming language like Java, Python, Ruby, or JavaScript on their computer. The script uses the Appium client library to send commands to the Appium server.
  2. The developer starts the Appium server on their computer. The server listens on a specified IP address and port number for incoming connections from the Appium client.
  3. The developer connects the Android device to their computer using a USB cable and enables USB debugging mode on the device.
  4. The developer installs the MyTaskList application on the Android device using the Android Debug Bridge (ADB) command-line tool on their computer.
  5. The developer runs the Appium test script on their computer by executing the script using a tool like Maven or Gradle or by running it directly from an IDE like Eclipse or IntelliJ IDEA.
  6. The Appium client in the test script sends a command to the Appium server to start a new session with the MyTaskList Android app.
  7. The Appium server starts the MyTaskList Android app on the connected device and installs the Appium bootstrap application on the device.
  8. The Appium server establishes a socket connection with the bootstrap application on the device over ADB.
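  9. The Appium client sends a command to the Appium server to find the "Add Task" button in the MyTaskList Android app. In the test above the button is located by its resource ID, com.example.mytasklist:id/add_task_button, not by its visible text.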
  10. The Appium server sends a command to the bootstrap application over the socket connection to find the button element using the UIAutomator framework on the device.
  11. The bootstrap application uses the UIAutomator framework to find the "Add Task" button and sends the element information back to the Appium server over the socket connection. The server returns a reference to that element to the Appium client.
  12. The Appium client sends a click command for that element, and the Appium server forwards it to the bootstrap application over the socket connection so the click can be performed with the UIAutomator framework on the device.
  13. The bootstrap application uses the UIAutomator framework to click on the "Add Task" button in the MyTaskList Android app and sends the result back to the Appium server over the socket connection.
  14. The Appium client in the test script receives the result of the click command and continues with the next step in the test script.
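
Steps 6 to 14 can be pictured as a short series of HTTP requests from the client library to the Appium server. The sketch below only builds and prints those requests; it does not send anything. The endpoint paths follow the W3C WebDriver specification as I understand it, and the /wd/hub prefix comes from the URL in the test above. The class name WebDriverRequestSketch, the SESSION_ID and ELEMENT_ID placeholders, and the trimmed-down capabilities JSON are my own assumptions for illustration; a real server generates the IDs, and a real client sends more capabilities.

```java
import java.util.Arrays;
import java.util.List;

public class WebDriverRequestSketch {
    // Base path copied from the server URL in the test script above.
    static final String BASE = "/wd/hub";

    // Builds "METHOD path body" lines for the four commands the test issues.
    static List<String> requestsFor(String sessionId, String elementId) {
        String session = BASE + "/session/" + sessionId;
        return Arrays.asList(
            // new AndroidDriver(...): start a session (capabilities simplified here)
            "POST " + BASE + "/session {\"capabilities\":{\"alwaysMatch\":{\"platformName\":\"Android\"}}}",
            // driver.findElement(By.id(...)): locate the button
            "POST " + session + "/element {\"using\":\"id\",\"value\":\"com.example.mytasklist:id/add_task_button\"}",
            // addButton.click()
            "POST " + session + "/element/" + elementId + "/click {}",
            // driver.quit()
            "DELETE " + session
        );
    }

    public static void main(String[] args) {
        // SESSION_ID and ELEMENT_ID are placeholders; a real server generates both.
        for (String line : requestsFor("SESSION_ID", "ELEMENT_ID")) {
            System.out.println(line);
        }
    }
}
```

Each Java call in the test maps to one request, which is why the test script itself never talks to ADB or the device directly.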

Conclusion

Hold up, that didn't explain why "web server" is all over the Appium documentation for Android native mobile testing. So I asked ChatGPT again and figured out that, basically, the Appium server is a web server.

The Appium server can be thought of as a web server that implements the WebDriver protocol. It listens for incoming HTTP requests from the client library, processes those requests, and communicates with the Android device (or iOS device) to perform the desired actions. The WebDriver protocol is a REST-style HTTP API that exchanges JSON; it grew out of Selenium's JSON Wire Protocol and is now a W3C standard. The client library sends commands as HTTP requests to the Appium server, and the server replies with HTTP responses.

This allows Appium to provide a language-agnostic API for test scripts, as the client libraries can be implemented in various programming languages, all communicating with the Appium server using the same HTTP-based protocol.

In summary, the "web server" in the documentation is the Appium server itself: it sits between the test script (running on the developer's computer) and the mobile device (either an Android or iOS device), speaks HTTP to the test script, and acts as a mediator to automate the app on the device.
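
One way to convince yourself that the Appium server really is a web server is to talk to it with a plain HTTP client and no Appium library at all. The sketch below uses the JDK's java.net.http client (Java 11 or later) to call the WebDriver status endpoint. It assumes a server is listening at the same address as in the test above; the class name AppiumStatusCheck and the helper statusRequest are names I made up, and the exact JSON you get back depends on the Appium version.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AppiumStatusCheck {
    // Builds a GET request for the WebDriver "status" endpoint of the given server URL.
    static HttpRequest statusRequest(String serverUrl) {
        return HttpRequest.newBuilder(URI.create(serverUrl + "/status")).GET().build();
    }

    public static void main(String[] args) {
        // Same server URL as in the test script; adjust if your server runs elsewhere.
        HttpRequest request = statusRequest("http://127.0.0.1:4723/wd/hub");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP status: " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means no Appium server is running at that address.
            System.out.println("Could not reach " + request.uri() + ": " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If the server is running, the reply is an ordinary HTTP response with a JSON body, the same kind of exchange the client library performs for every command in the test.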