Leading eDiscovery experts from KLDiscovery, Nyah King and Eric Robinson, join Counterfactual host Rebecca Wagner of Torys LLP to share important regulatory eDiscovery insights and provide a “sneak peek” at the hotly anticipated next wave of impactful technological advancements.
Join us for an exciting dive into the world of regulatory eDiscovery. In this episode of the Counterfactual podcast, we unpack key questions, including where to look for documents, how best to collect them, and how to choose the right technology for your review. Nyah King and Eric Robinson of KLDiscovery, two experts in the field, share their insights on a variety of facets of regulatory document productions. Nyah and Eric also fill us in on new technological advancements that may soon further shake up the eDiscovery landscape.
00:28
Hello and welcome to Counterfactual, the podcast produced by the Competition Law and Foreign Investment Review Section of the Canadian Bar Association. My name is Rebecca Wagner. I am a senior associate in the Competition and Foreign Investment Review group at Torys LLP in Toronto, and I am very pleased to host the next installment of the Counterfactual podcast. In this episode, I'll be speaking with Nyah King and Eric Robinson of KLDiscovery about some of the essential ins and outs of eDiscovery as it pertains to competition law. We will also touch on the evolution of eDiscovery, including the impact of recent technological advancements. But before we get started in earnest, a few words about our guests. Nyah is the Global Director of Legal Technologies Project Management at KLD. Nyah leads KLD's regulatory and investigation services team, responsible for consulting with clients through the eDiscovery lifecycle. Nyah and her team specialize in guiding clients to successful compliance with regulatory requests from global competition agencies. Eric is KLD's Vice President of Global Advisory Services and Strategic Client Solutions. He leads KLD's advisory services team and co-leads KLD's cyber incident response group. Eric works consultatively with clients to develop and implement cost-effective, efficient, and defensible discovery, data and information governance, and data privacy compliance strategies. He regularly serves as an expert witness and is frequently sought out to participate in thought leadership events. Nyah and Eric, thank you for joining us. While there is much to dive into under the topic of eDiscovery, perhaps the most prudent place to start is outlining what eDiscovery is and how it intersects with competition law.
02:17
Yeah, so first of all, let me just say thank you for having us. It's an honor and a pleasure to be here, and it was a pleasure working with you to develop the content for today. With respect to what eDiscovery is: as someone who's been in this industry for 25-plus years, it's something that has evolved over time. At its core, eDiscovery is the use of electronic means to manage data in the discovery process. So what does that mean? Think about what I'll call the old, or golden, years of discovery and litigation management, where we focused strictly on paper documents. Then, in the mid-2000s and into the 2015 timeframe, we saw an explosion of electronic data: email, other types of electronic documents being stored, and the volumes of data just exploding. In order to manage that, these technologies, these electronic review platforms and data management platforms, grew up to help attorneys better manage the documents in the discovery process: being able to do things like run searches, identify key documents, and address data sources like email, versus just a Word document, or, for those of us old enough to remember, programs like WordPerfect. And in today's world, we go far beyond traditional documents like word processing or even Excel spreadsheets, into collaboration platforms like Microsoft Teams, Slack, and Google Workspace, CRM platforms, an entire spectrum of data sources that are now part of what we call the electronic discovery world. So electronic discovery today is the leveraging of technology to manage these data sources most effectively and appropriately, not only to identify documents, but to very efficiently and defensibly identify those documents that are most likely relevant for any given scenario. That's a snippet. I could talk an hour on just that topic, but I'll stop there.
04:54
And I think the only thing that I would add is that eDiscovery is going to be largely the same regardless of the type of matter, whether it's litigation or transactional. Really, the collection and the processing are largely the same; where strategy may change is in terms of review and production format. Those may vary, but eDiscovery in general is largely the same across all types of matters.
05:26
Protocols may vary slightly based upon whether it's a regulator or a commercial litigation. But yes, as we start to talk specifically here today about things like SIRs and Section 11 orders, that type of thing, there are specific things you may want to do that are more focused in certain types of matters. But generally speaking, things like defensible collection of data, how you're reviewing data, preserving data, all of those things are, though some people don't like to say it this way, fairly standard across the board.
06:05
Only too true. Let's dive in further. What are some of the important initial steps at the outset of the eDiscovery process? Is how you start important to the ultimate production?
06:15
Yes, absolutely. How you begin is not only critical to the ultimate production, but to every step along the way. I can't stress enough the importance of proper planning and thorough execution. Generally, the first step is going to be determining your scope: who are the key players we need to get data from, what are the data sources we're obligated to collect, and what systems are in use that we'll need to collect from. At the onset of every matter, counsel and the eDiscovery provider should have an IT interview call with the end client. This call is really going to help the eDiscovery provider understand the different systems and platforms that are in use; you're going to have different tools that are required to collect from those sources, and it's important for everybody to be on the same page in terms of what we're collecting and what the best approach to the collection is. Custodian interviews, which happen around the same time as that IT interview, are very important upfront: really get a handle on who the custodians are that we're collecting from, where their relevant data is, and what types of data. Tracking all of this information on a custodial basis, very granular tracking, and having the eDiscovery provider keep a constant tab on who and what we are collecting lets you do one comprehensive sweep. That comprehensive tracking and sweep is really going to minimize the need to go into redundant sources; you're going to have custodians overlapping and identifying the same sources, so planning that at the outset is going to make things much more efficient. The same applies later: it's not uncommon that you'll go through this process and a custodian will be named later down the road. It's okay to go back to the sources and collect, but you'll still have your tracking of whether we've already collected this, with no need to do it again. The other thing is keeping an open line of communication between counsel, your eDiscovery provider, and your IT contact at the end client. That is going to be critical; you're not going to have a single conversation and be done. It's very likely that during the course of the custodian interviews, you're going to learn of additional systems or data sources that weren't covered in that IT interview, and having that open line of communication gives your eDiscovery provider the opportunity to continue discussions with IT and determine the best course of action for that collection. Lastly, early strategizing on your review methodology is going to be key to success too. What are you doing in terms of review? What are the necessary workflows? Are you going to be leveraging predictive coding, email threading, audio transcription, foreign language translation? Knowing the different tools you're going to be leveraging and the workflows that are going to be necessary will really help set the stage.
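To make the custodial-level tracking Nyah describes concrete, here is a minimal sketch, assuming a simple ledger of (custodian, source) pairs; real trackers record far more, such as collection dates, methods, volumes, and chain of custody.

```python
# Minimal sketch of a custodial-level collection ledger (hypothetical structure).
collected: set[tuple[str, str]] = set()

def needs_collection(custodian: str, source: str) -> bool:
    """True only if this custodian/source pair hasn't already been swept."""
    return (custodian, source) not in collected

def record_collection(custodian: str, source: str) -> None:
    collected.add((custodian, source))

record_collection("A. Smith", "Exchange mailbox")
record_collection("A. Smith", "OneDrive")

# A custodian named later in the matter: the ledger shows what is already done.
print(needs_collection("A. Smith", "Exchange mailbox"))  # False - no redundant sweep
print(needs_collection("B. Jones", "Exchange mailbox"))  # True - new collection needed
```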
09:42
Yeah, and just to add, layering on top of those operational components that Nyah was talking about: it's really important, in all types of matters, but particularly in these regulatory matters where the stakes may be higher, to really understand, (a), what your regulatory requirements are. Understand the idiosyncrasies of the particular agency, or even of a particular individual regulator, that may be involved. But then also, and this becomes very critical for counsel as well as the eDiscovery partner, understand the cultural, let's call them idiosyncrasies, of the actual end client. I could share war stories, if we had a lot more time than the hour we have today, about things where clients have gotten themselves into trouble because they were trying to cut corners, or were short-sighted in planning, not necessarily to any fault of counsel, but because of not really comprehending or understanding the full breadth of their responsibility. As an expert witness, having to come in at the 11th hour to try and defend a particular decision is often very challenging, and it often becomes a risk mitigation effort versus a "we actually did this right." So as part of that front-end strategic process, it's really critical to understand the tactical requirements as well as the strategic requirements and niceties of each one of these individual matters, because there is no cookie-cutter approach. Even for an organization that may have had multiple SIRs or multiple investigations, whatever it may be, each matter is going to have its own unique aspects. So ensuring those things is important. Nyah mentioned custodian interviews. With custodian interviews, you have to take the individual into consideration, and there are certain individuals and certain organizations that may be more challenging, either based upon their role or where they're physically located, to actually ensure that you're getting the appropriate information or data from them. That all becomes a key element in your strategic planning. And that component of what I refer to as the three-legged stool, between counsel, the end client, and the eDiscovery provider, is really critical. Those open, candid, collaborative conversations are really what drive success in these processes.
12:41
Start well to end well, basically. Measure twice, cut once.
12:46
You just very succinctly said what I said in a very long-winded way. So yes.
12:50
I'm sure our listeners really appreciate it. It's certainly been my experience that these adages, which come from somewhere, truly have applicability here. So next: how similar are regulatory document productions to litigation discovery? Oftentimes, document productions for regulatory requests are described as akin to discovery in litigation. But is that entirely accurate? What are some of the key differences between a regulatory document production and litigation discovery?
13:18
Sure. I think one very key distinction is timing. A regulatory document production will have much more expedited timelines; regulatory matters really need to comply in a quicker timeframe, often as quick as 30 days, whereas a typical litigation may span months or sometimes years. Because of the time constraints, and because speed is of the essence, most regulatory matters will leverage predictive coding to help hone in on the most responsive documents for the immediate production. If not predictive coding, then with today's data volumes it's certain that search terms are going to be necessary; it's not like a litigation matter, where you're reviewing eyes-on more documents than you typically would in a regulatory matter. Another thing is having discussions with the regulators about the methodologies you're going to employ. For example, failing to disclose terms, and to negotiate as necessary with the regulators, could ultimately mean that you've overlooked a universe of documents the regulator is interested in. Worst case scenario, that could be a failure to comply. In the best case scenario, it means you're going to have to run more terms, do more review, and produce more documents. And even in that best case scenario, it likely means you may need to go back and do a refresh collection of all of the other data sources, simply because of the timing in which documents were collected relative to the compliance productions. Eric, anything to add to that?
15:00
Yeah. I think the other thing, from a practical perspective, and some people may consider this controversial, is that in a regulatory production and review process, the primary focus, and Nyah, jump in if you feel differently, is on ensuring that you're identifying the privileged documents to be withheld. You're not as concerned with what I would call overproduction of non-relevant documents. Because of the expedited timelines, you want to preserve privilege to the extent possible, but you're likely going to overproduce on the non-privileged side, again because of timelines, cost, all those kinds of fun things. That's as opposed to a litigation matter, where, from a general strategic perspective, you want to give the adversarial party as little information as possible that they could then potentially use. So it's a slightly different mindset, a slightly different strategic approach from that perspective. And I think one of the big challenges, common between litigation and the regulatory piece, is how technologically or eDiscovery savvy the opposing party or the regulator is, because that may play a significant role in how much explaining or justifying needs to be done in order to leverage a particular protocol.
16:59
The same, but different. Indeed, I find that, as you mentioned, Nyah, predictive coding is basically a must at this point, especially given the unreliability of search terms, certainly at the collection stage. Onwards now to a central element of eDiscovery: technology. At the heart of eDiscovery is the technology that processes and analyzes the data, and technological advances in this area seem to occur regularly, at least to me. What are some of the key advancements in this space over the last few years? And is there one form of technology that is superior to the others, or is there a time and a place for each?
17:39
So I'm a firm believer that there's no one right answer ever. So well…
17:47
Life lessons in addition to eDiscovery lessons.
17:50
Yeah, as a lawyer, I should know better than to use absolutes. But as a general rule, there is a time and a place for all solutions. This is where we will consistently, during this conversation, come back to developing a strategy that fits the needs of the particular matter, whether it's regulatory or litigation; we're focused today on the regulatory piece. That said, and we've touched on it, and Nyah's done a really good job of pointing it out, with the data volumes in play in today's world, and this has been true for those of us who have been eyeball-deep in this over the last several years, the increasing volume of data has really pushed the greater adoption of technology to facilitate a more effective and efficient identification and review process. If we go back 15 years, you were lucky if you had OCR and an electronic text extraction that did a decent job of actually allowing you to run searches. But the ability to run searches across large volumes of data was a sea change in the discovery process, because prior to that, you had people literally sitting at tables in basements, running through pages of paper files, trying to figure out what might be potentially relevant. Trying to do that today, most people would just run away. With this changing landscape of technology, we went from running basic searches to being able to leverage things like email threading, where you can identify the most inclusive email in a thread of email communication. So if you've got 10 people on an email chain, instead of having to review every component email for each custodian as it appeared, you can now identify that most inclusive email, identify all the individuals associated with that communication thread, and review that string one time. Then, in any platform worth its salt, you'll be able to promulgate, or share, the coding from that email thread across the component emails that exist in your dataset. We can talk about production in different formats, whether you're producing at the inclusive email level or at the component email level. This is another place where regulators may differ: some may require producing at the component email level, but that doesn't prevent you from reviewing at the email thread level for efficiency purposes. We then moved into more advanced analytics as the technology evolved. For us, we introduced our predictive analytics tools almost 14 years ago and have continued to evolve them over time; the industry has continued to evolve in that arena as well. Predictive coding takes a number of different forms. Again, we could spend an hour talking just about predictive analytics. But at a high level, you're basically talking about two primary models. Some people use the terms TAR 1.0 and TAR 2.0, TAR standing for technology-assisted review. Because I'm old and I've been doing this a very long time, I fall in the camp that says technology-assisted review is a broad umbrella that encompasses a lot of different technologies. A lot of people use that terminology today to speak specifically about predictive coding tools.
When in actuality, predictive coding is just one tool under that umbrella. But for purposes of what has become basically the Kleenex of terms for predictive coding, we'll use the TAR term, as much as it puts a knife in my heart.
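As a rough illustration of the threading efficiency Eric describes, here is a minimal sketch in which only the most inclusive email in a thread is reviewed and its coding is then propagated to the component emails. The thread identification itself, which review platforms perform, is assumed, and the structures are hypothetical.

```python
# Hypothetical thread data: the platform has flagged e3 as the most inclusive email.
emails = [
    {"id": "e1", "thread": "t1", "inclusive": False},
    {"id": "e2", "thread": "t1", "inclusive": False},
    {"id": "e3", "thread": "t1", "inclusive": True},  # reviewed eyes-on, one time
]

# The reviewer codes only the inclusive email...
coding = {"e3": "responsive"}

# ...and the platform promulgates that call to each component email in the thread.
for email in emails:
    inclusive_call = next(coding[e["id"]] for e in emails
                          if e["thread"] == email["thread"] and e["inclusive"])
    coding[email["id"]] = inclusive_call

print(coding)  # every component email inherits the thread-level coding
```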
22:13
The connotative and denotative dichotomy.
22:18
That's right. But we talk about TAR 1.0 and TAR 2.0. TAR 1.0 is what we would more accurately call simple active learning, or SAL. That is where you have one or two subject matter experts who review training sets of documents to train a predictive model, which then scores the rest of your dataset. TAR 2.0 refers to what we call continuous active learning, or CAL, which is a model where your predictive analysis starts with document one across your entire review team, and then, as your review team is coding documents, the system is learning. There are very appropriate use cases for both. In traditional litigation, most often, probably 95% of the time, you're going to be using the continuous active learning model; it is by far the most common. From the regulatory perspective, and Nyah can talk more specifically about how this is being done day to day in the matters she and her team are managing, we're still seeing the use of the SAL model, the TAR 1.0 model, simply because it provides a faster jumpstart to that learning process. And because of the expedited timelines you're under in regulatory production schemes, it actually produces a more efficient result. Not necessarily a better result; it's just a different model. And then, not to confuse matters, but some people might refer to TAR 3.0, which is basically a hybrid model, where you might have a small team of subject matter experts start the review process to jumpstart a continuous active learning approach. We're seeing this gain some traction on the regulatory side, but it's still not predominant; that is a third approach to leveraging predictive analytics. We can also start to talk about different tools like natural language processing, other tools that are able to do more contextual analysis of the data to help review teams more effectively, efficiently, and expeditiously identify the relevant documents in the dataset. And we continue to evolve. I would be delinquent if I didn't mention the evolving state of technology and what's at the top of virtually everyone's mind when they talk about AI. Everything we've just talked about is technically artificial intelligence; anytime you're having the computer do something that a human would otherwise do, that's technically, by definition, AI. Where we're starting to see a new evolution is in the use of generative AI. We're not going to deep dive into that today, because it's still a very evolving area, with very limited use today. We'll see that change fairly quickly over the next few years, and I think we may address it in one of our upcoming questions. But in today's world, email threading, natural language processing, predictive analytics, and near-duplicate detection and analysis are some of the key areas where we're seeing technology and analytics being used to expedite review and provide efficiency, and, candidly, greater accuracy in these processes. Nyah, thoughts?
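For readers who want to see the shape of the TAR 1.0 / SAL workflow Eric outlines, here is a minimal sketch: a model is trained on an SME-coded seed set, and its scores rank the unreviewed balance of the corpus. This is illustrative only, using a generic scikit-learn classifier with invented documents; real platforms use richer features, iterative training rounds, and statistical validation.

```python
# Minimal simple-active-learning (TAR 1.0) sketch with hypothetical documents.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = [  # SME-reviewed training set
    "pricing discussion with competitor",
    "agreement on territory allocation",
    "lunch plans for friday",
    "office printer is broken again",
]
seed_labels = [1, 1, 0, 0]  # 1 = responsive, 0 = non-responsive (SME coding)
unreviewed = ["notes from competitor pricing call", "holiday party logistics"]

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Score the rest of the dataset; higher-scoring documents are prioritized.
scores = model.predict_proba(vectorizer.transform(unreviewed))[:, 1]
for doc, score in sorted(zip(unreviewed, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```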
26:27
Yeah, one point that I'd like to add is that while having all that technology at your disposal is important, I would suggest it's more important that you have an experienced provider who knows how to most effectively use that technology. Products are going to differ; everyone wants to claim that they have the best technology, and certainly you'll find some tools that you feel are superior to others. But what is really important is having a provider who can consult with you on all the tools they have in their wheelhouse and how to most effectively use those tools throughout the project. I think that's the one addition I would make to the technology itself.
27:14
I was just curious if there are tips you could provide to our audience for how to use the technology well.
27:21
Yeah, I think one of the key elements here is that the technology is just one piece of this puzzle. It doesn't matter how good the technology is if you don't (a) have individuals who know how to leverage it, as Nyah was referring to, or (b) have the appropriate validation protocols in place to validate what the machine is actually giving you and what you intend to produce. Without those two things, you're going to be really challenged to defend any results. So: knowledgeable users, providers, and support in the process, and then validation of the protocols that are in place. In today's world, there has yet to be any litigation around these predictive technologies; there's not been a single legal or court opinion on the technologies themselves. What has been challenged are the protocols used in leveraging these technologies, and whether those protocols were complete and/or defensible. And that's where, again, we continue to go back: taking the time early on to develop a sound strategic approach is really critical to these processes.
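One common validation idea behind the protocols Eric mentions is sampling the documents the process set aside and estimating how much responsive material would be missed. A rough sketch, with hypothetical numbers and a point estimate only; real protocols also compute confidence intervals and negotiate acceptance thresholds:

```python
def estimate_recall(responsive_produced: int,
                    discard_pile_size: int,
                    sample_size: int,
                    responsive_in_sample: int) -> float:
    """Point estimate of recall from a random sample of the set-aside documents."""
    elusion_rate = responsive_in_sample / sample_size
    estimated_missed = elusion_rate * discard_pile_size
    return responsive_produced / (responsive_produced + estimated_missed)

# Hypothetical matter: 40,000 documents slated for production, 200,000 set
# aside; a 2,000-document random sample of the set-aside finds 30 responsive.
print(f"Estimated recall: {estimate_recall(40_000, 200_000, 2_000, 30):.1%}")
# -> roughly 93%, which the team would weigh against the negotiated protocol
```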
28:49
And on the subject of tools: do eDiscovery tools now, or will they in the near future, incorporate AI technology like large language models and natural language searches as part of their toolbox of offerings?
29:04
The simple answer is yes. When that actually becomes mainstream remains to be seen. I can put on my Nostradamus hat, and I know what my personal thoughts are. We've already seen some technologies beginning to make a splash. Candidly, I think a lot of the technologies being marketed are not there yet; there are some that are showing promising results. But at the end of the day, with predictive coding it really took us almost eight to 10 years to get widespread adoption. Adoption grew steadily, but we were really eight to 10 years in before it was actually considered almost unethical not to leverage it. With this technology, the generative AI large language models, my personal opinion is that within the next 12 to 18 months we're going to start to see these technologies playing a significant role in these review processes. We all know that, generally speaking, attorneys are slow to change, and the legal profession is slow to change. I think that's going to change for a number of reasons. One, end clients are going to mandate that their legal teams leverage these efficiencies as a cost savings. But I also think, and this may be a touchy subject, that because we're starting to see a generational shift in leadership in the law firm environment, in corporate legal departments, and even within the regulators themselves, we're seeing individuals who actually grew up with technology, who are more comfortable with technology, in decision-making positions. So the idea and concept of leveraging technology is more readily accepted. What is going to be the precursor to basically breaking through the ceiling on this is proving that we've got validation tools in place, that we can validate and authenticate the results of these technologies. In that case, we'll probably end up seeing a judicial opinion that says, yes, this process or protocol is appropriate. But that's my soapbox on that for the moment.
32:03
And I know you alluded to this before, Eric, but beyond that, what other tools are on the horizon? And how will this change the discovery landscape? It just seems there's always something new coming out each time I turn to a production.
32:19
Yeah. So when we start to think about the generative AI tools and the large language models, basically what you're looking at is an environment where we can ask the technology a natural language question and have it interrogate a universe of data and say, here's what is relevant to what you asked me. Which is very different than having to create a Boolean search, where here are 30 search terms that I need connected in a very specific way. That technology is very rote; it's going to give you exactly what you asked for. One of the things that natural language processing and the generative AI tools do is dive more into the contextual, conceptual analysis of the data. So if I say I need to see every document where a reddish cog is discussed, I may get a document that says there was a ball bearing in this tool over here that had a red hue to it, or that team red developed this mechanism. Very different components, and probably not the best example in the world. But as we go through this, what's going to happen with these technologies, again my opinion and no one else's, unless they choose to agree with me, is that you're going to see a significant change in the managed review process. There's going to be a significant change in how associates at law firms and in corporate legal departments are addressing data, and a significant change in where in the process these tools start to be employed. Simply because, let's face it, if you look at most corporate environments today, I think something like 70% of corporations are using Microsoft 365 in some way, shape, or form. As Microsoft continues to enhance the ability to do certain things in that platform, and I can again share a lot of horror stories where clients have been overconfident in what they can actually do in these platforms, not just Microsoft, by the way, we're going to continue to see the corporate client, the end client, doing more on the front end to narrow datasets as they come out, leveraging some form of AI in the process. As the data moves downstream, we're going to see providers and counsel, in collaboration, leveraging tools to more effectively conduct early case assessment on these files, to significantly narrow datasets before they go into the review process. And then leveraging even more advanced AI within the review process to really focus on identifying those critical documents in a dataset. Can I tell you specifically what those tools are going to be? No, because they haven't been created yet.
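To make the contrast Eric draws concrete, here is a toy sketch comparing rote keyword matching with conceptual, embedding-based matching. The sentence-transformers library is used as one common general-purpose technique; this is not a claim about how any particular eDiscovery product implements conceptual search, and the documents are invented.

```python
from sentence_transformers import SentenceTransformer, util

docs = [
    "The ball bearing in this tool had a red hue to it.",
    "Team red developed this mechanism last quarter.",
    "Minutes from the facilities committee meeting.",
]
query = "documents discussing a reddish cog"

# Rote Boolean/keyword matching: literal terms only, so it finds nothing here.
print([d for d in docs if "reddish" in d.lower() and "cog" in d.lower()])

# Conceptual matching: rank documents by semantic similarity to the question.
model = SentenceTransformer("all-MiniLM-L6-v2")
scores = util.cos_sim(model.encode(query), model.encode(docs))[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```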
36:02
But you feel it in your bones that they're coming?
36:04
Oh, yeah, they're coming.
36:06
Well, I agree with you, there's at least one. And I'm sure others, especially clients, are looking for advancements where less data gets transferred, and more of what's relevant gets transferred for processing. But Nyah, speaking of trends, and of how these advancements have already impacted what regulators are doing: have you seen some emerging areas of focus that weren't there previously?
36:28
Yeah, I think one common trend we see now is the expectation of collecting and producing text messages, chats, and modern attachments, things that were a struggle maybe a few years ago but are no longer considered new data types; there's no reason we shouldn't be able to accommodate the collection and processing of these data types. So it's definitely important that you have a provider who is able to collect all the new, or recently newish, data types. Not only that they can collect them, but that they can convert and process them in a manner that allows the data to be produced to the regulator in a form where they can conduct a meaningful review. What that means, for example, is the ability to thread text messages and chats from Teams or Slack. Regulators are typically asking for these communications in meaningful chunks that offer enough context for them to conduct a review. The general rule we see is 24 hours: if there's a responsive text or chat, produce that whole thread for a 24-hour period. We also know that there's more data today and that data volumes grow year after year. With this exponential growth specific to these types of communications, Teams, Slack, the volumes are often so high now that it's really difficult to do a document-by-document review and still comply with the timing goals we see in regulatory matters. So what we have seen, at a global level really, is the acceptance of different methods that help manage the review and production of the relevant items. For example, on projects using predictive coding, a TAR 1.0 model for purposes of this discussion, we've seen instances with Slack volumes so high that regulators agreed to apply search terms just to that data source. It meant providing a data dictionary and going through the whole process of search term negotiations, but it allowed us to still meet the timing goals and achieve a timely compliance. We've also recently seen regulators allow for a short message format for these types of communications, consolidating those threads, in 24-hour chunks, into a single document, and even, in some instances, allowing those to go into a predictive coding, TAR 1.0, model for purposes of identifying responsiveness. So production specifications do tend to consider these communications now more than ever. Even when the requirements are baked into the production guidelines, regulators often come back and ask that you provide information within the production that will help them review these documents: if you're not producing them as a single combined document, a comprehensive thread, but as individual chats, there's an obligation, or an expectation, that you produce the thread information so they can put those threads together and really review this type of data in a meaningful way. I think that's probably the biggest trend I would call out.
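As a rough illustration of the 24-hour convention Nyah describes, here is a minimal sketch that gathers a flagged thread's messages for the surrounding calendar day into one produced document. The field names are hypothetical, and real tools also handle time zones, attachments, and edits.

```python
from datetime import datetime

messages = [  # hypothetical collected chat messages
    {"thread": "proj-x", "ts": datetime(2024, 5, 1, 9, 15), "text": "updated pricing deck"},
    {"thread": "proj-x", "ts": datetime(2024, 5, 1, 17, 40), "text": "send to the team"},
    {"thread": "proj-x", "ts": datetime(2024, 5, 3, 8, 5), "text": "unrelated chatter"},
]

# A reviewer flags one message as responsive...
flagged = messages[0]

# ...so the whole thread for that 24-hour period (read here as the calendar
# day) is consolidated into a single produced document.
chunk = [m for m in messages
         if m["thread"] == flagged["thread"]
         and m["ts"].date() == flagged["ts"].date()]
print(f"Produce one document with {len(chunk)} messages for {flagged['ts'].date()}")
```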
40:23
Yeah, I want to add just one thing. We've talked about data volumes being an issue, right, exploding data volumes, and Slack is one of those areas. Particularly during and immediately after the COVID era, Slack as a platform grew by, I think the metric was, almost 8,000% from a user and subscription perspective. We've actually seen that slow down, but one of the key elements, and this is often something clients end up getting surprised by when you collect data from Slack, and we talk about mountains of data, is this: say you go to Slack and do an export for 20 custodians, and it comes out saying it's going to be 50 gigabytes of data. Once you start to associate attachments and other documents, which you then have to collect separately in a kind of add-on process after you identify the messages, it's not uncommon for your Slack data to explode by 10 to 30 times. So 50 gigabytes can become 500 gigabytes very easily. We see that as a day-to-day challenge, and we spend a lot of time talking to clients about that possibility, because you can't predict it until you actually get the data out of the environment. So when we start to talk about strategic approaches and things like that, these platforms that are designed to create efficiencies for a business purpose create complications from the eDiscovery perspective. And these are things that are important for all the participants in the process to understand, which leads to things like what Nyah is referring to: being able to go to a regulator, negotiate, and say, we had no way to know, and now this is where we're at, so work with us.
42:49
Oh, indeed, for sure. And I'd say from my perspective, what we're advising is that what was private is certainly not private anymore. A lot of employees have grown accustomed to being more mindful in emails, while thinking that these other platforms are a free forum. And in essence, of course, PII is another hurdle to consider in these circumstances. As you mentioned, Eric, usually we look a lot for privilege in the main production, but now it's both in these platforms.
43:19
To your point, Rebecca, I mentioned the generational shifts that we see. When I started doing this work 25 years ago, we were still doing typed memos and mailing things, and then we evolved into an email culture. But again, as individuals who have grown up with technology and different forms of communication are moving into key roles, leadership roles, and more significant roles in organizations, we're seeing a significant shift in how communications are being shared. It used to be that email was considered a non-formal form of communication. Well, today, how often do you actually send a letter? I know you're a lawyer, you send letters all the time, but the typical individual, if they're trying to be formal, is sending an email. And generationally, we're seeing a shift where text messaging, or Teams chats, which used to be, it's called ephemeral for a reason, basically disposable, these forms of communication through chat platforms and collaboration platforms are starting to become the primary forms of business communication. Not just within organizations, but for organizations communicating out to third parties. So this is another area of technological shift that we're having to accommodate from the discovery perspective, in how you manage these data sources across the spectrum.
45:02
And collecting from them is not like collecting from email.
45:06
We need a separate, whole other podcast just on that.
45:09
Too true, too true. We do want to get to this next topic, which is central, certainly, to all of us who do these things: costs and time. I think it's incumbent upon me as host to make sure we touch on what affects efficiency and costs the most in reviews, so that our listeners can take some lessons from that.
45:35
Yeah. So I think there are many factors that are going to impact cost and time. As the number of custodians grows, your data volumes grow, and so too do your review demands and your production volumes. The best approach to managing costs and time goes back to having experienced key players on the project team who know how to leverage all the tools and methodologies to the benefit of a successful compliance. That means, first and foremost, having an experienced eDiscovery attorney from outside counsel who understands the overall flow of how things should happen. It means having a vendor who understands the changing landscape of all of the technology, a nimble provider with tools that can quickly and effectively collect, convert, and process all of the relevant data sources. Having an experienced project management team at your eDiscovery provider is critical, particularly with matters like regulatory work, cyber incident response, and multi-district litigations. That project manager is generally going to be the one from your provider who's consulting with you on the right tools to use for your specific matter. They're also going to be the one skilled at most effectively getting the documents that need to be reviewed to the review team, and at ultimately getting those out to production in the right format. It's critical to have a highly skilled review manager leading the document review: they're going to know how to appropriately staff the review, they're going to understand which technologies specific to review should be employed within each of the various work streams, and they understand how to balance that review, when they need to bring in more reviewers, or whether they should shift some reviewers from one workflow to another. And then, most importantly, these experienced individuals need to collectively work toward the same end goal: communicating early, often, and constantly; outlining the expectations; collectively ensuring all elements are considered; and being certain that everybody is contributing the way they need to be contributing. Do that, and you will have a successful matter.
47:58
And I'm going to go to the front end of this process and say: failing to plan is planning to fail. Taking a breath at the outset and developing a strategic plan for all phases is critical. Not that it's not going to change, because in 25 years of doing this, I've never had a project where something didn't come up that changed either the scope or the necessary protocols. So it's critical that any strategy you have be dynamic, and the partners you're working through that strategy with have to be, as Nyah said, nimble enough to shift with the shifting requirements. But having that upfront strategy, or at least an outline of how you think you need to approach it, is critical to ensuring that all of those downstream operational components are in place and are appropriate to the needs of the particular matter. So that's my added two cents to all of the valuable stuff Nyah was talking about from the operational side.
49:17
Those are all excellent points. From my perspective as well, as you both mentioned, it's that communication between the various parties, and I think it's also the speed of that communication. When questions are asked, it's really important that they get answered as quickly as possible, because one thing builds off another. So that's what we, as the individuals involved, need to keep in mind in terms of how to plan the process. But what about companies themselves? Eric, I think you were talking about this, and I know from previous conversations you're quite passionate about it. Given the prevalence of these regulatory requests and production orders, and the prevalence, again, of large data, it seems there's more and more data every time, and of course every gigabyte of data is more money and potentially more time. Are there things that companies can be doing to set themselves up in a better position to handle these regulatory productions?
50:18
I suppose just saying "yes" is not a sufficient answer.
50:22
Oh, well, it certainly is. But I'm sure our audience would appreciate a bit more detail.
50:26
So again, leveraging my experience having been in the corporate environment, the law firm environment, and on the service provider side, it's the same statement: failing to plan is planning to fail. Every organization should have a data governance, data management, and eDiscovery program in place. Some are going to be more sophisticated than others; those that don't have a lot of investigatory or litigation events probably aren't going to have as robust or as detailed a program as others may have. But they should have something, because no company is immune from either a regulatory investigation or litigation. For those organizations that have a consistent or even large litigation portfolio, or are regularly pulled into regulatory matters, or even internal investigations, or any type of investigation, having a defined process in place that follows the eDiscovery Reference Model, or EDRM, will allow you to define: what are my data identification and preservation protocols? How am I then collecting that data? Do I have a protocol in place for who my outside counsel is for certain types of matters, since different firms have different specialties? Who is my preferred eDiscovery provider? Being able to have those types of things in place is hugely beneficial. But on top of that is the data management piece: understanding what data you have in your organization and what systems that data lives in. When we're consulting with clients on the legal operations, discovery management, and data management and information governance side, we talk to them about developing data and systems inventories. You can do it at the enterprise level, and you can do it at the discovery level for those organizations that are regularly having to run through this process. Understanding what data you have, and where, creates significant efficiencies in this process, and also allows you, in general, to not only control costs but be more defensible, because the likelihood of missing something is much less. Another key component, if you've got proprietary or unique systems that are not necessarily mainstream, is understanding not only what systems you have, but how you get the data out of them for discovery purposes, because there are some platforms out there that are really challenging. If you're dealing with structured data, financial accounting systems, SAP, different systems, you can get a SQL database out, but then what are you going to do with it? These are all things that are part of a strategic initiative to plan for how you deal with these things across the spectrum of the litigation or discovery environment.
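A data and systems inventory of the kind Eric describes can start very simply. Here is a minimal sketch with hypothetical records and fields; real inventories capture far more, such as owners, retention rules, regions, and legal hold capability.

```python
from dataclasses import dataclass

@dataclass
class SystemRecord:
    name: str
    data_types: tuple[str, ...]
    export_method: str  # how data comes out for discovery purposes

inventory = [
    SystemRecord("Exchange Online", ("email", "calendar"), "compliance export"),
    SystemRecord("Slack", ("chat", "attachments"), "discovery export"),
    SystemRecord("SAP ERP", ("structured financials",), "SQL extract plus reporting"),
]

# Quick answer to "where does chat data live, and how do we collect it?"
for rec in inventory:
    if "chat" in rec.data_types:
        print(f"{rec.name}: {rec.export_method}")
```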
53:40
Interesting indeed. And I love the passion on the topic.
53:44
I’m trying to contain myself.
53:46
I can see that. Well, there are many more questions we would love to cover and I know that we could go on for hours. Likely many of the topics we touched on today could be the subject of their own podcast. But I think we've given our listeners enough to digest for the moment. Nyah and Eric, it's been a real pleasure to speak with you both. Thank you on behalf of the counterfactual podcast and our listeners for taking the time to share your knowledge and insights with us today.
54:10
Pleasure to be here. Thank you.
54:11
It's been an honor and a pleasure. Looking forward to engaging more in the future.