Selenium problems (and how to solve them)

Selenium is a very popular library and the de facto standard for browser automation. It has many advantages, but it also has many problems. In this article, I want to list the most important Selenium problems and address them one by one:

Hard to select elements on the web page

This problem is not strictly related to the Selenium library. It mostly comes from poor semantic markup of the elements in the application views.

If our page structure looks like:

<div>
    <span>...</span>
    <span>...</span>
    <span>
        <span>...</span>
        <span>...</span>
    </span>
    <span>...</span>
    <span>...</span>
</div>

Then it may be complicated to select a particular element on the page. Even if we manage to write such a selector, it will be very sensitive to changes in the page. Here is an example selector from Facebook. It is very complex and hard to maintain.

div._61xb:nth-child(2) > div:nth-child(1) > div:nth-child(1) > div:nth-child(2) > div:nth-child(2) > span:nth-child(1) > span:nth-child(1)

Solution

The best way to solve this is to ask the developers to add prefixed classes (e.g. with an e2e- prefix, short for end-to-end tests) to all important elements on the web page. It is very easy for developers to do, and it definitely makes the tests easier to maintain. We can also add custom attributes like data-test-id:

<div>
    <span>...</span>
    <span>...</span>
    <span>
        <span class="e2e-pay-button">...</span>
        <span data-test-id="pay-button">...</span>
    </span>
    <span>...</span>
    <span>...</span>
</div>

Then we can select this element with a short, focused selector. With the prefixed class:

.e2e-pay-button

or, with the custom attribute:

[data-test-id=pay-button]

Marking classes with a custom prefix also helps to maintain the HTML code, because the purpose of such a class is easy to recognize.
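To keep the naming convention in one place, the selector strings can be built by a small helper. The sketch below is only an illustration: the class name TestSelectors and its methods are hypothetical, and it assumes the e2e- prefix and the data-test-id attribute described above.

```csharp
using System;
using System.Linq;

// Hypothetical helper: builds CSS selectors for elements marked for end-to-end tests.
public static class TestSelectors
{
    // Returns ".e2e-pay-button" for an element with class="e2e-pay-button".
    public static string ByE2eClass(string name) => ".e2e-" + Validate(name);

    // Returns "[data-test-id='pay-button']" for an element with data-test-id="pay-button".
    public static string ByTestId(string id) => "[data-test-id='" + Validate(id) + "']";

    // Accept only letters, digits, '-' and '_' so the value needs no escaping.
    private static string Validate(string value)
    {
        if (string.IsNullOrEmpty(value) ||
            !value.All(c => char.IsLetterOrDigit(c) || c == '-' || c == '_'))
        {
            throw new ArgumentException("Unexpected test id: " + value, nameof(value));
        }
        return value;
    }
}
```

In a Selenium test the result would be passed to By.CssSelector, for example driver.FindElement(By.CssSelector(TestSelectors.ByTestId("pay-button"))).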

Repeatability of tests

Complex applications that store their information in databases are hard to test, because a single end-to-end test case needs test data configured at multiple levels. Moreover, if the resources in your application are limited (e.g. a limited number of seats on a bus in a reservation system), your tests can't be executed multiple times against the same test data.

Solution

To deal with this, we need a fixed set of test data that is loaded into the testing environment before the tests run. A CI environment with full database recreation and a clean application deployment is good for this purpose.

Moreover, to make your tests repeatable you can use snapshots. This is a database-level feature that saves the current state of the DB, lets you perform some actions, and restores the saved state at the end of testing. This is described in this article.

Speed of testing

Because Selenium tests are executed in a real browser, with a proxy interpreting the automation commands, they are quite slow. Executing all tests can take several minutes, which can be problematic in a continuous integration pipeline with many commits and builds every day.

Solution

To shorten the total test execution time, we can do two things:

  • add a category to each test: necessary or additional. This attribute splits the tests into two subsets: the base test scope and the full test scope
  • define that the necessary tests are executed on all branches, while all tests are executed only when we merge the code to the main branch (e.g. master).
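As a rough sketch of the two points above, assuming the tests are written with NUnit (the fixture name, test names and category names here are made up for illustration):

```csharp
using NUnit.Framework;

[TestFixture]
public class BookingTests
{
    // Base test scope: executed on every branch.
    [Test, Category("Necessary")]
    public void UserCanLogIn()
    {
        // ... test body ...
    }

    // Full test scope only: executed after merging to the main branch.
    [Test, Category("Additional")]
    public void UserCanResetPassword()
    {
        // ... test body ...
    }
}

// Possible CI commands, assuming the dotnet CLI with the NUnit test adapter:
//   branch builds:  dotnet test --filter TestCategory=Necessary
//   main branch:    dotnet test
```

With this split, a branch build pays only for the necessary subset, and the slower full run happens once per merge.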

Hard to read

Test readability is a real pain when we want to maintain tests and understand what they do some time after they were written.

Usually, tests look like this:

var driver = new ChromeDriver();
driver.Navigate().GoToUrl(testUrl);
// Assert.AreEqual takes the expected value first, then the actual one.
var selectedLineCss = driver.FindElement(By.CssSelector("body > div > div.row.poem1 > div > ul > li.list-group-item.selected"));
Assert.AreEqual("And there is another sunshine,", selectedLineCss.Text);
var selectedLineByAttribute = driver.FindElement(By.CssSelector(".list-group-item[title='flowers']"));
Assert.AreEqual("In its unfading flowers", selectedLineByAttribute.Text);
var selectedLineCssSimple = driver.FindElement(By.CssSelector(".poem1 .list-group > .list-group-item.selected"));
Assert.AreEqual("And there is another sunshine,", selectedLineCssSimple.Text);
var selectedLineXPath = driver.FindElement(By.XPath("/html/body/div/div[2]/div/ul/li[3]"));
Assert.AreEqual("And there is another sunshine,", selectedLineXPath.Text);

// Search inside an already found element.
var poemElement = driver.FindElement(By.ClassName("poem1"));
var selectedByNested = poemElement.FindElement(By.CssSelector(".list-group > .list-group-item.selected"));
Assert.AreEqual("And there is another sunshine,", selectedByNested.Text);

var allLines = poemElement.FindElements(By.ClassName("list-group-item"));
Assert.AreEqual(14, allLines.Count);
var disabledLine = poemElement.FindElement(By.ClassName("disabled"));
Assert.AreEqual("brother", disabledLine.GetAttribute("title"));

driver.Quit();

Details of the HTML structure are mixed with the logic of the actions. It is very hard to read.

Solution

We can use PageObjects and Flows to structure our tests in a proper, easy-to-read form.

A PageObject is the place for all HTML page-related elements, and Flows are the place for all business logic.

public class LoginPage
{
    public IWebElement Email { get; set; }
    public IWebElement Password { get; set; }
    public IWebElement LoginButton { get; set; }
    public IWebElement ForgotPasswordButton { get; set; }
}


public class ForgotPasswordPage {
    public IWebElement Email { get; set; }
    public IWebElement SendButton { get; set; }
}


public class LoginFlow
{
    private LoginPage _loginPage;
    private ForgotPasswordPage _forgotPasswordPage;


    public LoginFlow()
    {
        _loginPage = new LoginPage();
        _forgotPasswordPage = new ForgotPasswordPage();
    }


    public void Login(string user, string password)
    {
        _loginPage.Email.SendKeys(user);
        _loginPage.Password.SendKeys(password);
        _loginPage.LoginButton.Click();
    }


    public void ForgotPassword()
    {
        _forgotPasswordPage.Email.SendKeys("");
        _forgotPasswordPage.SendButton.Click();
    }
}




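// A test method inside a test fixture. _loginFlow and _bookingFlow are fields
// of the fixture; BookingFlow is built like LoginFlow and is not shown here.
[Test]
public void Test()
{
    _loginFlow.ForgotPassword();
    _loginFlow.Login("user", "pass");
    _bookingFlow.FillData();
    _bookingFlow.Book();
}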

As you can see in the code above, Flows make the test very readable and straightforward.
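The page objects above expose their elements as plain properties and do not show where the elements come from. One possible way to fill them in is to let each property look its element up through the driver. This is only a sketch: the e2e- class names are hypothetical and would have to match the real markup.

```csharp
using OpenQA.Selenium;

public class LoginPage
{
    private readonly IWebDriver _driver;

    public LoginPage(IWebDriver driver)
    {
        _driver = driver;
    }

    // Each property finds its element on access, so all selectors for
    // the login page live in this one class. The selectors are assumptions.
    public IWebElement Email =>
        _driver.FindElement(By.CssSelector(".e2e-login-email"));

    public IWebElement Password =>
        _driver.FindElement(By.CssSelector(".e2e-login-password"));

    public IWebElement LoginButton =>
        _driver.FindElement(By.CssSelector(".e2e-login-button"));

    public IWebElement ForgotPasswordButton =>
        _driver.FindElement(By.CssSelector(".e2e-forgot-password-button"));
}
```

With this variant, a flow would create the page with new LoginPage(driver) instead of the parameterless constructor, and a change in the HTML means touching only the page object, not the flows or the tests.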

Waiting for each element

Waiting in Selenium is a big pain. In modern web applications all data is fetched from the server asynchronously, so we can't be sure when exactly it will be visible on the page. This can cause tests to pass and fail randomly.

Solution

The most comfortable way to solve this is to write custom adapters for the Selenium methods you use, so that every lookup waits for its element. We can use the Lazy class in C# for this purpose.

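// "Input1" is an illustrative property name; the wait runs only when .Value is read.
public Lazy<IWebElement> Input1
{
    get
    {
        return new Lazy<IWebElement>(() =>
        {
            WebDriverWait wait = new WebDriverWait(_driver, TimeSpan.FromSeconds(5));
            // In Selenium 4 for .NET, ExpectedConditions comes from the separate
            // DotNetSeleniumExtras.WaitHelpers package.
            return wait.Until(ExpectedConditions.ElementIsVisible(By.Id("input1")));
        });
    }
}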

The other option is to use an out-of-the-box library integrated with the framework used in our application, like Protractor for Angular. It handles all the waiting because it knows the internal Angular architecture.
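A custom adapter of the kind described above could look roughly like the extension method below. It is a sketch, not a complete wrapper: the names WaitingDriverExtensions and FindVisible are made up, the 5-second default is arbitrary, and it assumes the Selenium.Support package for WebDriverWait.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class WaitingDriverExtensions
{
    // Polls until an element matching 'by' exists and is displayed.
    // Throws WebDriverTimeoutException if that does not happen in time.
    public static IWebElement FindVisible(
        this IWebDriver driver, By by, int timeoutSeconds = 5)
    {
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(timeoutSeconds));

        // The page may re-render between the lookup and the Displayed check.
        wait.IgnoreExceptionTypes(typeof(StaleElementReferenceException));

        return wait.Until(d =>
        {
            // A missing element is retried: WebDriverWait ignores
            // NotFoundException by default.
            var element = d.FindElement(by);

            // Returning null tells Until to keep polling.
            return element.Displayed ? element : null;
        });
    }
}
```

A test or page object would then call driver.FindVisible(By.Id("input1")) wherever it previously called driver.FindElement, so the waiting policy is defined once.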

Work with custom web controls

Usually, it is hard to work with custom JS controls, especially if they are asynchronous and use a complex HTML structure. They are used very widely: date-time inputs, calendars, grids.

Solution

There is no shortcut here. We need to handle all operations on custom controls manually, but we can define general Element classes that let us work with such controls in many places through one common interface.

grid.Item[1,2].SendText("aaa");
grid.Item[2,3].Click();

It is easier to read and maintain because all the HTML logic is kept in a single place.
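Such an Element class for a grid might be sketched as follows. The sketch assumes the grid renders as an HTML table and that editable cells contain an input element; a real control will likely need different selectors. It uses a C# indexer, so a cell is addressed as grid[1, 2] rather than grid.Item[1,2].

```csharp
using OpenQA.Selenium;

// Wraps one cell so that tests never touch the cell's HTML directly.
public class GridCell
{
    private readonly IWebElement _cell;

    public GridCell(IWebElement cell)
    {
        _cell = cell;
    }

    public void Click() => _cell.Click();

    // Assumes an editable cell contains an <input> element.
    public void SendText(string text)
    {
        var input = _cell.FindElement(By.CssSelector("input"));
        input.Clear();
        input.SendKeys(text);
    }
}

public class Grid
{
    private readonly IWebElement _root;

    // 'root' is the grid's outer element, assumed here to be a <table>.
    public Grid(IWebElement root)
    {
        _root = root;
    }

    // Row and column are 1-based, matching CSS :nth-child numbering.
    public GridCell this[int row, int column] =>
        new GridCell(_root.FindElement(By.CssSelector(
            $"tbody > tr:nth-child({row}) > td:nth-child({column})")));
}
```

A test would create the wrapper once, for example var grid = new Grid(driver.FindElement(By.CssSelector(".e2e-orders-grid"))), where the class name is again hypothetical, and then call grid[1, 2].SendText("aaa") or grid[2, 3].Click().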

You can also use a .NET library that supports operations on web controls, such as Selenium.Utils.

Poor troubleshooting

When we execute our tests remotely, it is quite hard to check what went wrong. We have a call stack, but no information about the application state.

Solution

We can implement the logic to take application screenshots when a test fails. 

public static void TakeScreenshot(this IWebDriver chromeDriver, string testName)
{
    OpenQA.Selenium.Support.Extensions.WebDriverExtensions
        .TakeScreenshot(chromeDriver)
        // HH is the 24-hour clock; with hh, an afternoon run gets the same stamp as a morning one.
        .SaveAsFile(_relativePath +
            string.Format("{0}_{1:yyyy-MM-dd-HH-mm-ss}.acceptance.png",
            testName, DateTime.Now),
            ImageFormat.Png);
}


[TearDown]
public void TearDown()
{
    var state = TestContext.CurrentContext.Result.Outcome;
    if (state == ResultState.Error || state == ResultState.Failure)
    {
        _driver.TakeScreenshot(TestContext.CurrentContext.Test.FullName);
    }
    
    _driver.Quit();
}

Moreover, we can add a custom error message to each assert in the code.

Assert.IsNotNull(element, "Custom message");

Summary

This short list shows the most important (and painful) problems in Selenium. However, I want to show you that none of them is a blocker for using Selenium for end-to-end testing in your projects.

There are also other tools, which I will describe in the next post.