App performance hacks: Troubleshooting like a pro

Azhar Altaf

Regardless of how mature an organization is or how well its software has been developed, performance issues are likely to surface at some point.

Why is app performance important to organizations?

Organizations spend a lot of resources, both in terms of money and time, to develop software that should make money back for them. Whenever there are performance issues, the organizations suffer in the following ways:

  • Increased costs: To keep up with the load, on-premises servers are upgraded and cloud server classes are increased, which incurs costs.
  • Reduced customer satisfaction and loyalty: If your applications are not niche (like banking applications), users will try to look for alternative applications to fulfill their needs.
  • Reduced conversion rates: If general-purpose applications do not perform well, word spreads fast and user onboarding can slow down.
  • Reduced profits: Slow applications can cause loss of business when users can’t do what they need to.
  • Reduced workplace innovation: Every performance issue can cause organizational stress, and pinpointing root causes can slow the development of new applications. If this happens frequently, the organization may lose its competitive advantage.

Overview of the troubleshooting process

Whenever performance issues occur, there is no need to rush to upscale the servers or stress about the problem. A methodical approach is required instead, because upscaling often provides only temporary relief, and the issues return if the underlying cause is not fixed.

The performance issue resolution process is divided into two main phases: troubleshooting analysis and implementation.

  1. Symptom: Define the faulty behavior, addressing what is happening, when it is happening, and how it is manifesting.
  2. Hypothesis: Formulate a possible cause for the issue, which might involve faulty code, misconfiguration, or compatibility issues.
  3. Tool: Select the appropriate tool to collect the data needed to test the hypothesis.
  4. Root cause: Analyze the data the tool produces to identify the definitive cause of the issue.
  5. Fix: Implement the corrective action.
  6. Test: Validate that the fix resolves the original symptom and introduces no new problems.

Note: This is an iterative process, and fixing one issue may reveal others.

Typical issues in app performance

Some issues that are commonly a cause for concern in both backend and frontend stacks are as follows.

  • Slow screen transitions/flickering
  • Slow data fetch
  • Slow application load (specific to mobile apps)

Slow screen transitions/flickering

The reactive stack, which includes mobile apps developed on the OutSystems platform, relies heavily on the best practices of the underlying technologies.

Many times, the functionality provided by the platform is not used correctly, causing issues in terms of UI/UX of the applications.

Incorrect use of OnInitialize/OnReady

These events are supposed to be lightweight and should do little more than initialize parameters, JavaScript, and the like. No server communication should happen in these events.

90 different elements are used inside the OnInitialize event.

Server communication and data-fetching happening in this event

These are the first events triggered when a user clicks a link or button to start loading the page into the DOM. Having too many elements here, or any server communication, can lead the user to assume the navigation isn't happening and click again, which may cause multiple page loads.

Recommendation: Use these events only for setting variables and initializing JavaScript.

Lack of knowledge of page life cycles

The reactive stack uses multiple page events to load a new page into the DOM and display it to the user. Developers who are not familiar with these page life cycle events can introduce performance issues that affect both the client side and the server side (servers and databases).

OnRender is called after every aggregate and is used in screens/blocks.

Having elements inside OnRender can cause flickering on screens, because it is called every time a data fetch action returns data and the DOM needs to be repainted.

Recommendation: Leave OnRender empty unless there is a specific need to run logic on every render.

Slow data fetch: Client side

Data is the key aspect of the applications we develop. At times, it can cause applications to crawl if there is no optimization.

Server requests not optimized

Too many calls made to the server can slow down the application. If multiple actions need to be called, use separate Data Actions instead of putting everything into a single Data Action that calls multiple server actions internally.

Using multiple Data Actions only works well if the actions do not depend on each other and can run asynchronously.
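
To illustrate why independent fetches benefit from running side by side, here is a minimal Python sketch. It is not OutSystems code: the fetch names and delays are hypothetical, and asyncio.sleep simply stands in for a server round trip.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    # Hypothetical stand-in for one independent server call.
    await asyncio.sleep(delay)
    return f"{name} loaded"


async def load_sequentially(calls):
    # Everything chained in one action: total time is the sum of the delays.
    return [await fetch(name, delay) for name, delay in calls]


async def load_concurrently(calls):
    # Independent fetches started together: total time is roughly the slowest call.
    return await asyncio.gather(*(fetch(name, delay) for name, delay in calls))


def timed(coro_fn, calls):
    # Run one strategy and return its results with the elapsed wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_fn(calls))
    return results, time.perf_counter() - start


CALLS = [("orders", 0.2), ("profile", 0.2), ("notifications", 0.2)]

if __name__ == "__main__":
    _, sequential_time = timed(load_sequentially, CALLS)
    _, concurrent_time = timed(load_concurrently, CALLS)
    print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

If one fetch needed the result of another, the two would have to run in order and the concurrent version would bring no gain, which is why the independence condition matters.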

The actions should return only the data that is needed on the screens. The platform will not do any optimization if Server/Service actions return entities as output.

Recommendation: It is best to create multiple actions that return different structures for your use cases. The platform will optimize the database query to fetch only the fields that are set in the structures.

Slow integrations

External integrations can cause your applications to experience performance issues. Each slow API call can cause threads to wait for a response until the platform timeout is reached. The platform timeout does not mean the API call stops on the external system; it will continue to be processed there until that system's own timeout is reached.

Recommendations: Do not change the module or consumed API timeout just because external APIs are slow. Use the Circuit Breaker pattern from the Forge to call the external APIs selectively.
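
For readers unfamiliar with the circuit breaker idea, the following is a generic Python sketch of the pattern, not the API of the Forge component. The class name, the thresholds, and the injectable clock are all assumptions made for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_after = reset_after              # cool-down in seconds
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of tying up a thread on a slow system.
                raise CircuitOpenError("external system temporarily skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # A successful call closes the circuit again.
        self.failures = 0
        return result
```

While the circuit is open, callers get an immediate error they can handle (for example by showing cached data) rather than waiting for a timeout on every request.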

Not using caching

Caching comes in handy when the data rarely changes. It works well for master data or configuration data, and it can prevent database calls or heavy server actions that take time to process.

Recommendation: Implement caching for elements that rarely change. You can always refresh the cache if something changes in the data.
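
The idea of a time-limited cache with an explicit refresh can be sketched in a few lines of Python. This is only an illustration of the concept, not how the platform implements its cache; TimedCache, its parameters, and the loader function are hypothetical names.

```python
import time


class TimedCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._entries = {}          # key -> (expires_at, value)

    def get(self, key, loader):
        entry = self._entries.get(key)
        if entry is not None and self.clock() < entry[0]:
            # Still fresh: skip the expensive call entirely.
            return entry[1]
        # Missing or expired: load once and remember the result.
        value = loader()
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        return value

    def invalidate(self, key):
        # Force the next get() to reload, e.g. after the master data changes.
        self._entries.pop(key, None)
```

A screen that needs a list of countries would call something like cache.get("countries", load_countries), so the loader runs at most once per expiry window unless the entry is invalidated.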

Heavy Local Storage (Mobile Apps)

Many times we mimic the server-side data model in local storage. We need to be mindful of what data is needed and how it is stored, because local storage uses device or browser storage to save the data.

Local storage is meant to be lightweight and should only have the data that is needed.

Recommendation: Create local storage in a denormalized form so that fewer joins with other local entities are needed, which improves performance. Do not fetch everything from the server. Only fetch data that is relevant for your day-to-day operations.

Slow data fetch: Server side

Back-end performance centers around efficient database querying and action optimization.

Server/service actions not optimized

When these actions aren’t optimized:

  • Connected aggregates/SQL perform multiple DB calls which are significantly impacted by network latency.
  • Aggregates/SQL performed in a loop. The platform suggests combining these into a single query if they depend on a previous query result.

Recommendations: Combine the aggregates into a single aggregate. If combining is not possible, convert the aggregates into Advanced SQL.
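
The difference between querying in a loop and combining the work into one query can be shown with plain SQL. The sketch below uses Python's built-in sqlite3 module and a hypothetical customer/orders schema; it is an analogy for what happens behind aggregates, not OutSystems code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.0);
""")


def orders_with_names_in_loop(conn):
    # Anti-pattern: one query for the orders, then one extra query per row.
    rows = []
    for order_id, customer_id, total in conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id"
    ).fetchall():
        (name,) = conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        rows.append((order_id, name, total))
    return rows


def orders_with_names_joined(conn):
    # Preferred: a single query with a JOIN returns the same rows in one call.
    return conn.execute(
        """
        SELECT o.id, c.name, o.total
        FROM orders AS o
        JOIN customer AS c ON c.id = o.customer_id
        ORDER BY o.id
        """
    ).fetchall()
```

Both functions return the same rows, but the loop version issues one query per order on top of the first one, and each of those pays the network latency to the database.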

Fetching more data than needed

This issue can be caused by:

  • Not limiting the number of rows fetched. Leaving Max Records empty can put stress on the database servers. When a single record is needed, set Max Records to 1; when batches are being handled (as in timers), setting Max Records to 100 should suffice.
  • Fetching all columns instead of only the necessary ones. Aggregates in server/service actions are not optimized based on the output if the entire entity is returned.
  • Sometimes the platform will automatically create a JOIN to tables because they have a 1:1 relationship. These tables should be removed from aggregates if they are not actually needed.

Recommendations: Always specify the number of records and columns to fetch. The platform will optimize the query to fetch limited columns when you use structures over entities as output parameters.
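
In SQL terms, limiting records and columns corresponds to a column list plus a row limit. The following sqlite3 sketch uses a hypothetical product table to contrast the two; the max_records parameter plays the role of Max Records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, description TEXT, image BLOB)"
)
conn.executemany(
    "INSERT INTO product (name, description, image) VALUES (?, ?, ?)",
    [(f"Product {i}", "long text " * 50, b"\x00" * 1024) for i in range(500)],
)


def fetch_everything(conn):
    # Anti-pattern: every column of every row is read and transferred.
    return conn.execute("SELECT * FROM product").fetchall()


def fetch_page(conn, max_records):
    # Only the columns the screen shows, capped at max_records rows.
    return conn.execute(
        "SELECT id, name FROM product ORDER BY id LIMIT ?", (max_records,)
    ).fetchall()
```

Here fetch_page(conn, 100) transfers two small columns for 100 rows, while fetch_everything(conn) also drags along the description and image of all 500 rows.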

Unoptimized data model

An unoptimized data model slows fetching through:

  • Missing indexes: Indexes act as pointers to the data being queried. Without them, the DB servers have to examine every row to find the ones that match the criteria. The bigger the table, the bigger the hit on the DB servers.
  • Using OR and LIKE clauses: These can also trigger full table scans, meaning the bigger the table, the slower the execution.
  • Using functions on column names: This prevents the use of indexes that have been created on those columns.

Note that the OutSystems platform creates non-functional indexes, but on-premises customers have the flexibility to create function-based indexes in their databases.

Recommendation: Once you know which queries take longer to run, you can index the fields used in the WHERE and JOIN clauses, paying attention to the order of those fields when doing so.

For on-premises customers, it’s also possible (with access to the database) to analyze the query execution plan to identify possible indexes that would speed up query processing.

Do not use functions on column names inside Aggregates or Advanced SQL, as the queries will then ignore any indexes that exist on those columns.

Whenever possible, avoid OR and LIKE inside queries, as they can cause full table scans.
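
The effect of wrapping an indexed column in a function can be observed in a query plan. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration; the customer table and index name are hypothetical, and the databases behind OutSystems have their own plan formats and optimizers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_customer_email ON customer (email)")


def plan(conn, sql, params=()):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Filtering on the bare column lets the planner search the index.
indexed = plan(conn, "SELECT name FROM customer WHERE email = ?", ("a@example.com",))

# Wrapping the column in a function hides it from the plain index,
# so the planner falls back to scanning the table.
wrapped = plan(conn, "SELECT name FROM customer WHERE lower(email) = ?", ("a@example.com",))

if __name__ == "__main__":
    print("bare column:   ", indexed)
    print("function call: ", wrapped)
```

The exact wording of the plan varies between SQLite versions, but the first plan names idx_customer_email while the second reports a scan of the customer table.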

Huge datasets

It is a common practice not to archive data into secondary storage. Over time, the data grows and performance issues arise.

The rule of thumb is to have minimal data inside your primary OutSystems tables to improve performance of data manipulation and retrieval.

Recommendation: Implement archiving mechanisms in your application, as explained here. If no archiving criteria are defined, you will need to define them and move data from primary to secondary storage.

Purge data that has been archived to improve performance. Archiving without purging provides no benefit.
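
An archive-then-purge step can be sketched as a copy followed by a delete inside one transaction. The example below uses sqlite3 with hypothetical audit_log tables and a date cutoff as the archiving criterion; for simplicity the archive table lives in the same database, whereas real secondary storage would usually be elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audit_log (id INTEGER PRIMARY KEY, created_on TEXT, message TEXT);
    CREATE TABLE audit_log_archive (id INTEGER PRIMARY KEY, created_on TEXT, message TEXT);
    INSERT INTO audit_log (created_on, message) VALUES
        ('2022-01-15', 'old entry'),
        ('2022-06-30', 'old entry'),
        ('2024-03-01', 'recent entry');
""")


def archive_and_purge(conn, cutoff_date):
    # Copy and delete in one transaction so a failure leaves both tables unchanged.
    with conn:
        conn.execute(
            "INSERT INTO audit_log_archive SELECT * FROM audit_log WHERE created_on < ?",
            (cutoff_date,),
        )
        purged = conn.execute(
            "DELETE FROM audit_log WHERE created_on < ?", (cutoff_date,)
        ).rowcount
    # Number of rows removed from the primary table.
    return purged
```

Calling archive_and_purge(conn, "2023-01-01") moves the two 2022 rows to the archive table and leaves only the recent entry in the primary table.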

The troubleshooting toolkit: Available tools

Effective analysis requires specialized tools, ranging from platform monitoring dashboards to standard browser developer tools.

AI Mentor Studio dashboard

The AI-based analysis performed by the tool points out issues pertaining to Architecture, Security, and Performance, all of which play a crucial part in the runtime of the applications.

The tool provides a number of Performance-specific warnings about patterns that can hinder the scalability of the infrastructure and the user experience.

Check out the documentation for Mentor Studio for all the details.

Service Center logs

The OutSystems platform captures runtime metrics for every environment the customer has. These logs come in handy when troubleshooting performance issues that may arise in the following areas:

  • Errors
  • General Logs (for example, slow SQL queries)
  • Integration Logs (for APIs consumed and exposed)
  • Extension Log
  • Service Action Logs
  • Timers & Processes

The amount of detail logged depends on the logging level configured for the module. By default, every module uses the Default log level. Raising the log level can affect performance, as more details need to be computed and logged.

  • Default: Logs basic information (date, time, action name, user, and duration) but no headers or payloads.
  • Troubleshoot: Logs error details.
  • Full: Logs request headers and payloads.

Chrome DevTools

The Network and Performance tabs in Chrome DevTools are especially useful for understanding performance issues.

These tools can be used to detect performance issues when loading resources, and to see how applications behave on slower CPUs and limited bandwidth.

Conclusion

Even though performance issues are common and need to be addressed, OutSystems provides tools and best practices that can prevent them from recurring. Upscaling infrastructure can be a temporary solution, but most of the time the issue lies in your code, which needs to be diagnosed with the tools described above and then fixed. Quick wins should be prioritized over fixing everything in one go, as large, sweeping changes can have detrimental effects.

Read the OutSystems documentation for troubleshooting app performance to learn more about what the platform offers.