As mentioned in an earlier post, the data consumption of the PVA also plays an important role in its licensing.
We also need to consider the data storage requirements of a PVA implementation: how do we calculate the data consumed by the PVA so that we can arrive at the storage we need? With the usual configuration, the PVA needs storage for all the conversations of the chatbot and for any attachments uploaded during those conversations.
Let's look at the table that stores conversation transcripts. All the conversations in the PVA bot are stored in the ConversationTranscript table. Hence, to forecast the data storage, we need to understand the size of each conversation and the number of conversations we expect. The size, in turn, depends on the complexity of the conversations.
We can refer to any existing implementation and check the size of the ConversationTranscript table and the count of conversations (which is ideally the number of records in this table 😀). Dividing the table size by the record count gives the average size per conversation. If it is our first implementation, we have to assume a number and then validate it during operations.
Generally we have to forecast the sizing and license cost for one year, so we should multiply the size per conversation by the number of conversations expected per year (size per conversation × 12 × number of conversations per month). Once we get this number, we can calculate the forecasted Dataverse size needed for this implementation.
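The calculation above can be sketched in a few lines of Python. This is only an illustration: the function names and all the figures (a 500 MB table with 10,000 transcripts, 20,000 conversations per month) are hypothetical and should be replaced with numbers from your own environment.

```python
def avg_conversation_size_kb(table_size_mb: float, record_count: int) -> float:
    """Average transcript size in KB, derived from an existing implementation."""
    if record_count <= 0:
        raise ValueError("record_count must be positive")
    return table_size_mb * 1024 / record_count


def yearly_transcript_storage_gb(size_per_conversation_kb: float,
                                 conversations_per_month: int) -> float:
    """Size per conversation x 12 x conversations per month, converted to GB."""
    yearly_kb = size_per_conversation_kb * 12 * conversations_per_month
    return yearly_kb / (1024 * 1024)


# Hypothetical figures: an existing ConversationTranscript table of 500 MB
# holding 10,000 records, and 20,000 conversations expected per month.
size_kb = avg_conversation_size_kb(table_size_mb=500, record_count=10_000)
forecast_gb = yearly_transcript_storage_gb(size_kb, conversations_per_month=20_000)
print(f"{size_kb:.1f} KB per conversation, {forecast_gb:.2f} GB per year")
```

For a first implementation, the assumed size per conversation can be passed straight into the second function and corrected later with real numbers.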
In addition to conversation transcripts, let us also check the requirements in case we expect users to upload attachments, or in case plugin trace logs or audit logs need to be tracked. We have to consider that data consumption as well while calculating the Dataverse size.
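As a rough sketch, these extra items can be added to the transcript forecast as shown here. The function name, the breakdown into three buckets and every figure are assumptions made for illustration, not values taken from any real environment.

```python
def total_yearly_storage_gb(transcripts_gb: float,
                            attachments_per_month: int,
                            avg_attachment_mb: float,
                            logs_gb_per_month: float) -> dict:
    """Yearly storage forecast in GB: transcripts + attachments + logs."""
    attachments_gb = attachments_per_month * avg_attachment_mb * 12 / 1024
    logs_gb = logs_gb_per_month * 12
    return {
        "transcripts_gb": transcripts_gb,
        "attachments_gb": attachments_gb,
        "logs_gb": logs_gb,
        "total_gb": transcripts_gb + attachments_gb + logs_gb,
    }


# Hypothetical figures: 10 GB of transcripts per year, 1,024 attachments
# per month averaging 1 MB each, and 0.5 GB of trace/audit logs per month.
estimate = total_yearly_storage_gb(
    transcripts_gb=10.0,
    attachments_per_month=1024,
    avg_attachment_mb=1.0,
    logs_gb_per_month=0.5,
)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

Keeping the three parts separate in the result makes it easier to see which one drives the total.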
Based on all the above calculations, we arrive at the forecasted size and purchase the licenses accordingly.
While calculating the size, also consider the data retention policy required for the implementation. By default there is a bulk deletion job that runs every day and deletes the conversation transcripts older than a month. This keeps the storage requirements low. If the bot is for internal users and the data is not needed, we can keep this job enabled, changing its filter setting if required. However, if the bot is for customers and there are policies on customer data retention for audit purposes, we have to disable this bulk deletion job. In that case we need to plan for storage requirements on the higher side.
To optimize costs, we can also think of using a data lake to store this history data. A solution can be implemented to archive historical conversations to more economical storage rather than keeping them in Dataverse storage.
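The effect of the retention policy on storage can be sketched as follows. It is a simplified model that assumes a constant monthly transcript volume; the function name and the 1 GB per month figure are hypothetical.

```python
from typing import Optional


def stored_transcripts_gb(monthly_gb: float,
                          months_elapsed: int,
                          retention_months: Optional[int]) -> float:
    """Transcript data still held after `months_elapsed` months.

    retention_months=None models a disabled bulk deletion job
    (nothing is ever deleted).
    """
    if retention_months is None:
        kept_months = months_elapsed
    else:
        kept_months = min(months_elapsed, retention_months)
    return monthly_gb * kept_months


# Hypothetical volume: 1 GB of transcripts per month, checked after one year.
with_default_job = stored_transcripts_gb(1.0, months_elapsed=12, retention_months=1)
job_disabled = stored_transcripts_gb(1.0, months_elapsed=12, retention_months=None)
print(f"Job enabled (1 month kept): {with_default_job:.1f} GB")
print(f"Job disabled: {job_disabled:.1f} GB")
```

With the job enabled, storage levels off at about one month of data, while with the job disabled it keeps growing, which is where archiving to cheaper storage starts to pay off.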