
Friday, September 21, 2018

Keynote speech, First International Lignin Conference

I was honoured to be invited to present a keynote speech at this conference in Edmonton this week. An excellent event! I will write it up, but meanwhile here is a link to my presentation.

Monday, July 30, 2018

Adoption of disruptive technologies: Lessons from history, Part II

In Part I, I described how Studebaker made the transition from being the largest maker of wagons in the US to a successful car manufacturer (at least until 1966). In Part II, I want to describe the Curtiss and Wright companies as they competed to develop the modern aviation business.

At the beginning of the 20th century, Glenn Curtiss operated a motorcycle factory in Hammondsport, NY, where a museum stands today in his memory. The museum is well worth a stop if you are in the area; allow at least three hours, as the 60-minute documentary is well worth the time. It is also possible to visit the restoration shop out back, where a WWII fighter is currently being rebuilt.

WWII fighter rescued from the bottom of a lake and undergoing restoration in the museum workshop. 
Ask anyone who invented the airplane, and the answer will be the Wright brothers, Orville and Wilbur. Certainly they developed an understanding of forces acting on a wing, and how to control those forces, well beyond what any other person or company was capable of in the years from 1900 to their first flight in 1903 and the granting of their patent on control methods in 1906. They were excellent experimenters and craftsmen, and while the patent proved to be overly broad, it was well deserved.
Glenn Curtiss managed 136.7 mph on this 4.4-litre V8 motorcycle in 1907. The record stood until 1930.

1907 Curtiss.  

However, others were perhaps more adept at commercialisation. Glenn Curtiss was one such pioneer, an engineer but also what we would recognise today as an entrepreneur. Initially a builder of bicycles and motorcycles, he built more and more successful airplanes, winning the prestigious Gordon Bennett race in France in 1909. (The Wrights did not enter, but two independent pilots entered Wright planes; neither won any awards.) By 1910 he was working with the US Navy in San Diego on launching and landing airplanes from ships' decks.

In this same period, the Wrights continued to build prototypes and small numbers of commercial planes while engaging in a nasty patent battle with Curtiss. While the Wrights were careful experimenters, it would seem they were not moving as quickly as Curtiss in commercialising, relying instead on their patent and plenty of legal strong-arm tactics to maintain their advantage. The issue addressed by the Wright patent was the method of controlling the wing surface; the patent, which was broad, was interpreted by the courts as describing any control mechanism, not just the method the Wrights developed of warping the wing structure to change the wing profile. Curtiss came up with a method of changing wing characteristics using movable ailerons, which are much closer to modern flaps and other movable parts than to the flexing of an entire wing structure. Unfortunately, in 1914, a court ruled that the Wright patent covered any method of controlling flight surfaces, and Curtiss was forced to cease operations.

In 1916, Orville Wright retired and sold his interest in the patent to the Wright-Martin Corporation for approximately $1 million. (Wilbur had passed away in 1912.) Eager to recoup the cost of the patent, Wright-Martin continued the legal battles. By 1916, the only new Wright plane was a single prototype, and while Curtiss was capable of producing saleable airplanes at close to commercial scale, he was effectively out of business. As a result, airplane development in the US had stalled, and the US had no airplanes for the war effort. In 1917, the US government forced all aviation patent holders to share their patents in a pool, and to pay a membership fee to participate; most of the proceeds went to Wright-Martin and to Curtiss, who held a number of patents of his own. The goal was to get airplanes built, using the best available technologies, and to stop the legal squabbles. Wikipedia claims no American airplanes were used in WWI, but the Curtiss museum claims that US-made planes were indeed used in WWI, and that they were all built by Curtiss. In any case it is clear that most if not all airplanes used in WWI were of European manufacture, in many cases in violation of the Wright patent as interpreted by American judges.

Replica of the Curtiss America.

Replica of the Curtiss America, designed in 1914 to win a £10,000 prize offered by the London Daily Mail for the first trans-Atlantic crossing. The outbreak of WWI prevented the attempt, and the plane was sold to England for use as a submarine spotter. This replica was built in Hammondsport by volunteers and first flew in 2008.
Curtiss America cockpit.

The patent pool was not meant to last beyond the end of the war, but manufacturers did not resume legal hostilities in 1918, so the patent war died with the end of World War I. Ironically, the Wright and Curtiss companies merged in 1929 to form Curtiss-Wright, which still operates today. To his dying day, Orville Wright felt that the Wright name should have come first.

So what can we learn from this? First, a patent that stifles innovation is a problem for all. The US aviation industry was held back by this nasty little feud, and no one really gained from it (except, perhaps, the lawyers). Second, patents can have unexpected side effects, especially in new or emerging industries where the full importance of something may not be understood until later. Many technologies in use today work the way they do for reasons that have nothing to do with adoption of the 'best' technology, but because one side won a legal battle, or had better marketing or better licensing terms. (VHS versus Beta, anyone?) Third, the experimenter who does superb ground-breaking work needs to be recognised and encouraged, but the skills of the entrepreneur in getting things built quickly and efficiently are equally critical in getting a new industry up and onto its feet.

Oh, and the museum has lots more: wooden boats, motorcycles, cars, weapons and other bits from the dawn of the 20th century. Plus the town of Hammondsport is delightful. Drop in!




Adoption of disruptive technologies: lessons from history, Part I

I had the opportunity earlier this summer to visit a couple of museums focused on the early years of the automotive and aviation industries. It strikes me that there are some parallels with the new bioeconomy. In the spirit of the old saying that those who ignore history are doomed to repeat it, I offer some thoughts that might be relevant, based on a couple of case studies.

My first stop was at the Studebaker museum in South Bend, Indiana. If you know anything about Studebaker, you'll recall that production ended in South Bend in December 1963, and that the last car rolled off the assembly line in Hamilton, Ontario, in 1966.

The last Studebaker, produced March 17, 1966. This car is currently in the Studebaker museum.
The main reason was a failure to compete with the Big Three in the so-called pony car wars (the Mustang being the initial entrant and leader of the pony car class). Studebaker built solid, dependable and somewhat boring cars at a time when the American public wanted vavoom; the inability to build competitive vehicles was partly if not completely due to poor finances.

Innovative, but not a sales success: The Avanti. After Studebaker closed its doors, a consortium of Studebaker dealers purchased the Avanti name, spare parts and tooling, and continued hand-building cars in very small numbers. Since then, the company has had multiple owners, and has moved several times; the last car was made in Cancun, Mexico, in 2006. The then-current owner's arrest on fraud charges was certainly a contributing factor, but the niche automobile business is a very difficult one at the best of times, even without legal problems.
So why is a failed company like Studebaker a relevant case study for the bioeconomy? To answer, we need to go back to the beginnings of the automobile and look at the innovators. The big names include Henry Ford, Walter P. Chrysler, Louis Chevrolet, Gottlieb Daimler, Karl Benz and others. None had any background in the horse and buggy industry: Ford worked as an engineer with the Edison Illuminating Company, which supported his early work; Chrysler and Benz worked for various railways; Chevrolet essentially worked exclusively in the automotive industry from a young age; Daimler and his colleague Maybach designed steam and gas engines from the 1870s. All founded new companies to develop and market their new inventions.

In contrast, two of the five Studebaker brothers, Clement and Henry Jr., set up shop in South Bend in 1852 to build horse-drawn carriages. The firm was incorporated in 1868. According to Wikipedia, half of the wagons used during the height of westward migration were Studebakers; they also supplied the Union forces with wagons during the US Civil War.

1857 (back) and 1919 (front) Studebaker buggies. The 1919 model is the last buggy built by Studebaker and went directly to the company's museum.

1910 Studebaker dump wagon, 1 ton capacity. Trap doors in the floor serve to empty the load. Presently in the Studebaker museum. 
By 1885 they were exporting carriages around the world, and annual revenue was $2 million, a huge amount at a time when the sturdy dump wagon shown in the figure sold for $202.78; many of the majestic homes built by the brothers and their partners are still to be seen in beautiful South Bend neighborhoods.

In 1895, a younger generation of sons and sons-in-law, led by Frederick S. Fish, began asking whether the company should look at this newfangled self-propelled device called the automobile. The discussion was apparently heated, and an engineer was assigned to the question only in 1897, after the passing of one of the remaining brothers, who felt that wagon sales were strong and the company should not lose its focus. The first automobile, made in 1902, was electric; the company made $4 million selling wagons that year. Gasoline-powered cars, built in partnership with other automobile firms, came along in 1904, and the last electric was built in 1911. Early teething problems were sorted out using cash from the wagon business, which continued to be strong, but this was a declining market; by 1918 sales of automobiles had reached 100,000 units while horse-drawn unit sales had dropped to 75,000, from 467,000 in 1911. Studebaker finally sold the wagon business to a firm in Kentucky in 1920, replacing the wagons with a line of commercial vehicles.

1937 Studebaker Coupe Express pickup truck. Presently in the Studebaker museum.

1933 Studebaker President. Presently in the Studebaker museum.
The company survived the Depression, partly due to its reputation for solid and dependable vehicles. It is, if not the only one, certainly one of the rare cases of a firm successfully making the transition from horse-drawn carriages to the automobile. While the firm was unable to navigate the rapid changes in the post-war automotive market, the decision to adopt a disruptive technology in 1897 led to almost 70 more years of commercial activity. This contrasts with most other players in the early automotive industry, who were overwhelmingly outsiders, often railway engineers already comfortable with the concept of a self-propelled vehicle.

What can we learn from this? A few things. First, you need a successful business with decent cash flow to support a move to a new field. The Canadian pulp and paper industry, by and large, is finding the existing business challenging, and free cash flow is in short supply. Second, partnerships can be a good way of distributing risk, although they need to be selected carefully; Studebaker's partnerships with existing automotive firms were at best unsuccessful, and at worst almost destroyed the new business due to poor product quality. Picking a technology provider therefore rightfully requires a painful amount of due diligence, painful for the technology provider if not for the sponsoring company. In my former life as a research manager, this was a difficult lesson to learn. Third, vision starts at the top; the transition at Studebaker only started in earnest once board members opposed to the change passed away or moved on.

In Part II of this post, I'll look at the effect of patent wars on the adoption of new technologies. Spoiler alert: it wasn't pretty.

References: The Studebaker museum in South Bend, Indiana; Wikipedia; http://www.stude100.com/stu/Pg1/pg1.php

Upcoming conferences in the forest biorefinery area

I see it has been ten months since I posted here. I spent a large portion of that time moving all the contents of my home into storage so I could move out for three months while the place was renovated; I'm now moved back in and things have settled down enough to allow me to pick up the biorefinery file once again. Time flies when you are having fun!

I am very excited about a couple of upcoming conferences in the forest biorefinery space.

The first is the PAPTAC International Lignin Conference, to be held in Edmonton the week of September 17, 2018. As a member of the program committee, I can say that the selection of proposed abstracts is of very high quality, and cutting it down to a three-day conference will be very challenging. If the wide range of papers from the academic, vendor and lignin producer worlds is any indication, we now have a robust ecosystem of players, which bodes well for this new and growing industry.

This remarkable boot-strap effort has overcome, to a significant degree, the old chicken-and-egg problem: no one will build a plant to make lignin if there is no customer for the product, but no customer will sign any kind of commitment to purchase from an as-yet unbuilt plant, which will make a product of unknown properties and price. Today we have three full-scale plants producing softwood kraft lignin, with more in the pipeline. This affords customers the opportunity to work with multiple suppliers; conversely, there are multiple buyers, so no plant is at the mercy of a single customer. The story of how we got here is a fascinating one that I hope to be able to address in a keynote speech.

Click here for conference details.

The second is the 8th Nordic Wood Biorefinery Conference, set for Helsinki the week of October 22. The last seven editions of this conference have all been excellent, partly due to the 18-month gap between editions, and I do not expect this year to be any different.

Click here for conference details.

See you there!


Wednesday, July 26, 2017

Triage process for selecting bio-economy projects, Part III

I posted two articles on triage processes back in May 2017. Since then I have come across a very interesting approach for looking at different molecules: plotting these molecules in terms of their oxygen mass fraction (or weight percent) versus their hydrogen mass fraction.

This concept is presented by Farmer and Macal in Figure 4.12 in Chapter 4, Platform Molecules, Introduction to Chemicals from Biomass, Second Edition, James Clark and Fabien Deswarte (editors), Wiley, 2015.

I've created my own so-called Farmer plot with a selection of molecules of interest to folks active in the bio-plastics and bio-chemicals fields:



You can see that the aromatics and straight-chain hydrocarbons all fall on the X-axis, as they have varying amounts of hydrogen but no oxygen. As they come directly from petroleum, and as the petroleum industry doesn't like oxygen, this makes sense.

Bio-based compounds, by and large, are found in the middle of the chart, as they contain both oxygen and hydrogen in varying quantities. The interesting part is that a number of petrochemical intermediates, such as mono-ethylene glycol (MEG), polyethylene terephthalate (PET) or phenol, are also oxygenated even though they originate from a pure hydrocarbon such as ethylene. Oxygenating the hydrocarbon precursor requires effort that may not be insignificant, and presumably occurs as late as possible in the petrochemical process.

So it is worth asking why you would deoxygenate a bio-based molecule (lignin or cellulose, for instance) all the way to ethylene, only to oxygenate it back to, say, PET. (You would also have to hydrogenate to get to ethylene, then dehydrogenate.) The alternative, which the various people making furan-based compounds from glucose have figured out, is simply to shift the short distance directly from the bio-based molecule to the target (for example, glucose to PEF, the furanic substitute for PET), avoiding a whole lot of de- and re-oxygenation and hydrogenation which would be required if going all the way to ethylene or another of the six platform hydrocarbon molecules first.
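As a quick illustration of where these coordinates come from (my own sketch, not taken from Farmer and Macal), the following Python snippet computes the hydrogen and oxygen mass fractions of a few of the molecules mentioned above, using repeat units for the polymers:

```python
# Coordinates for a Farmer plot: hydrogen and oxygen mass fractions,
# computed directly from molecular formulas (repeat units for polymers).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def farmer_coordinates(formula):
    """Return (H mass fraction, O mass fraction) for an {element: count} dict."""
    total = sum(ATOMIC_MASS[el] * n for el, n in formula.items())
    h = ATOMIC_MASS["H"] * formula.get("H", 0) / total
    o = ATOMIC_MASS["O"] * formula.get("O", 0) / total
    return h, o

molecules = {
    "ethylene (C2H4)":      {"C": 2, "H": 4},
    "benzene (C6H6)":       {"C": 6, "H": 6},
    "phenol (C6H6O)":       {"C": 6, "H": 6, "O": 1},
    "MEG (C2H6O2)":         {"C": 2, "H": 6, "O": 2},
    "glucose (C6H12O6)":    {"C": 6, "H": 12, "O": 6},
    "PET repeat (C10H8O4)": {"C": 10, "H": 8, "O": 4},
}

for name, formula in molecules.items():
    h, o = farmer_coordinates(formula)
    print(f"{name:22s} H = {h:.3f}  O = {o:.3f}")
```

The pure hydrocarbons come out with an oxygen fraction of zero (the X-axis), while MEG, glucose and the PET repeat unit land well up the oxygen scale.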


The plot above shows phenol from lignin (the red line), as opposed to phenol from a lignin-derived benzene molecule (the two blue lines). Arguably, the total distance between two molecules on the graph is an indication of the relative difficulty of converting one into the other. Of course some reactions are going to be easier than others, and more chemistry will be needed to evaluate exactly what effort is required, but may I suggest that the following equation, which uses Pythagoras' Theorem to calculate the distance between two points in a plane, would be a very quick and dirty first pass at flagging complicated chemistry, with a large value of D an indication of potentially complex chemistry:

D = √[(H_bio − H_petro)² + (O_bio − O_petro)²]

where H and O are the hydrogen and oxygen coordinates, respectively, of the bio-based and petroleum-based molecules. So in the case above, we have the following:

  • D[SW lignin to Phenol] = 0.0965 (red line)
  • D[SW lignin to Benzene] + D[Benzene to Phenol] = 0.2669 + 0.1707 = 0.4376 (two blue lines)
Both approaches lead from SW lignin to phenol, but the first path is shorter by a factor of about 4.5, and (obviously!) must be a lot simpler. Certainly, for the non-chemist managerial type such as myself, it would be a flag to highlight when more questions need to be asked of the chemists.
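For the non-chemist who wants to automate the flag, here is a minimal Python sketch of the calculation. The benzene and phenol coordinates follow directly from their molecular formulas; the softwood lignin point is an assumed value based on a typical kraft lignin elemental analysis (roughly 6% hydrogen and 26-27% oxygen by weight), so the distances come out close to, but not exactly, the figures quoted above:

```python
from math import hypot

# (H, O) mass-fraction coordinates. Benzene and phenol follow from their
# formulas; the softwood kraft lignin point is an assumed typical elemental
# analysis, so the results are approximate.
points = {
    "SW lignin": (0.060, 0.265),
    "benzene":   (0.077, 0.000),   # C6H6
    "phenol":    (0.064, 0.170),   # C6H6O
}

def D(a, b):
    """Euclidean distance between two molecules in the (H, O) plane."""
    (h1, o1), (h2, o2) = points[a], points[b]
    return hypot(h1 - h2, o1 - o2)

direct = D("SW lignin", "phenol")
via_benzene = D("SW lignin", "benzene") + D("benzene", "phenol")
print(f"direct route: D = {direct:.3f}")       # the red line
print(f"via benzene:  D = {via_benzene:.3f}")  # the two blue lines
```

Swap in measured elemental analyses for the feedstocks you actually have, and the same few lines will flag the long routes.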

So a question for the chemists out there: while admitting that this approach is crude, is it a sensible first pass at estimating complexity of a chemical transformation? If not, why not? Discuss amongst yourselves and report back to the plenary session.

2017 World Congress on Industrial Biotechnology

Overview

Held in Montreal on July 23-26, 2017, this was the 14th edition of this well-respected conference. Having attended many of the past editions, I can say that the material presented mirrors the maturity of the bio-chemicals industry well, and that progress towards the current level of maturity has been quite rapid over the last decade. A lot of hard work has gotten the industry to the point where it is actually making and selling commercial bio-polymers and chemicals, rather than just talking about biofuels as was the case just 5 years ago.

With 6 parallel tracks, it was impossible to cover everything. I focused on the renewable chemicals and bio-based materials sessions.

Renewable Chemicals

Several presentations focused on various furanic molecules as substitutes for aromatic molecules. Chief among these was furan dicarboxylic acid (FDCA) which can be converted to polyethylene furanoate (PEF), a substitute for petroleum-based polyethylene terephthalate (PET) in the making of clear plastic bottles.

In the same session, Stephen Roest of Corbion described Corbion's pathway, via a proprietary micro-organism, from C6 sugars to FDCA and eventually PEF. Jesper van Berkel of Synvina (a JV between Avantium and BASF) described their catalytic process to the same PEF molecule. Michael Saltzberg of Dupont described a similar catalytic pathway to a methyl ester of FDCA, called FDME. All three presenters boasted of significantly improved oxygen and carbon dioxide barrier properties compared to petroleum-based PET; improved mechanical properties were also cited. These properties allow for thinner bottles or flexible food packaging, partly offsetting the higher cost per kilogram, or for a cheaper plastic substitute for cans or glass in high-performance applications. All claimed acceptable recyclability when mixed with PET streams, at least at low addition levels. Only Saltzberg commented on colour as a potential issue, claiming that Dupont's FDME is more stable in shipping and storage than FDCA, and thus less likely to yellow. All are planning or building commercial plants in the kilo-tonne per year range. (Corbion has the advantage of being able to swing a lactic acid plant to FDCA and back with relative ease.) Finally, REACH registration for FDCA is complete in Europe, while various EPA registration processes are underway. Essentially this market is growing towards maturity, and we are well beyond the test-tube phase.

Succinic acid was also on the agenda, with mentions made of Succinity (the Corbion-BASF JV) and a presentation by Marcel Lubben, CEO of Reverdia (a JV between Roquette and DSM). Bio-Amber was also represented; see the section on the Sarnia Cluster below. This is another product which is reaching commercial maturity, although perhaps not as fast as some of the players might wish.

Other biomass-to-plastics processes were described by Don Wardius and Natalie Bittner of Covestro. This spin-off of Bayer's polyurethane and polycarbonate business has developed proprietary pathways from sugars to phenol and aniline, two major inputs into their production processes. The bio-conversion pathways were not described in any detail, so it isn't really possible to evaluate yields or costs, but the fact that they are willing to discuss this in public is revealing of the company's focus and intent.

In a separate session on scale-up issues, Cecil Massie of AMEC Foster Wheeler outlined six lessons for avoiding common mistakes:
  1. Pilot a continuous process to verify how an upset in the first unit operation is handled in subsequent units; upsets that propagate downstream can be problematic. 
  2. In biological processes running in very dilute conditions, the water balance will be critical. Water recovery and recycle may require expensive equipment. 
  3. Heat integration and energy conservation are as important as water balances.
  4. Steady state is one thing but knowing how to deal with upset conditions, especially startup and shutdown conditions, is critical. Cold starts in particular may require a lot more steam than at steady state, especially in cold climates. 
  5. An early and iterative HAZOP process will identify problems early. For instance, reducing the risk of a spill or fire in a chemical plant often means reducing the volume of surge tanks; however, this leads to downstream control problems when an upstream unit has an upset and the surge tank only has a few minutes' retention time.
  6. The actual commercial plant may be different from the pilot (e.g. filter press instead of centrifuge). Anticipate the data needed to evaluate a true commercial plant. 

Drop-in versus novel bio-based molecules

One recurring theme was the different challenges and opportunities for a drop-in molecule versus a novel molecule with new properties. The drop-in will fit into existing value chains and production facilities, as it is essentially identical in molecular structure and performance to the petroleum incumbent; the downside is that you can't really command a premium price merely on the basis of being bio-based, so it must be cost-competitive with the incumbent. This is the downfall of bio-fuels in the absence of robust government incentives (carbon taxes, renewable fuel standards, etc.).

In comparison, a molecule such as PEF, which is claimed to perform better than PET, can claim a premium price for that performance. However, downstream processes to use the molecule are not necessarily well developed, and end-of-life issues such as recyclability or bio-degradability may be problematic, so more work may be required to achieve significant market penetration. 

A third option is to use the bio-based material as-is at partial substitution rates. One good example is the use of lignin from kraft pulp mills as a partial substitute for phenol formaldehyde resins in the making of engineered wood products such as plywood. The lignin works well up to about 50% substitution, beyond which failure in tension starts to occur in the glue line rather than in the wood; current substitution rates are quite a bit lower than this due to lower reactivity leading to increased cure times. However, unmodified lignin is cost-competitive with petroleum-based resins in this application, as long as plywood production rates are not affected by the longer cure times. 

There are good examples of each of these pathways becoming commercially viable, however, so it looks like there will be different approaches for different applications.

The pulp and paper sector carries on, but stays under the radar

Representatives of CelluForce (Richard Berry), Domtar (Bruno Marcoccia) and Kruger (Balazs Tolnai) described various activities in which their firms are engaged in the fields of novel chemicals and fibres. Progress continues, and while forest sector players don't feel the need to trumpet success the way the emerging bio-products producers do, it doesn't mean nothing is going on. CelluForce is planning a new 10 t/d CNC plant (up from the current 1 t/d capacity). Domtar's lignin plant in Plymouth, NC, is sold out, and they are making a plastic film of polyethylene and 25% lignin for farm use. Kruger is making cellulose filaments for improved paper properties, polymer composites, and concrete and mortar applications. 

The Sarnia Cluster

Sandy Marshall of Bio-industrial Innovation Canada chaired a session that explored the synergies available to the bio-industry in the Sarnia area. Dave Park of the Cellulosic Sugar Producers Cooperative (CSPC) described the incentive for farmers not only to supply a cellulosic sugar plant with crop residues (corn stover and wheat straw), but also to participate in the financing and operation of the plant. In this context, Andrew Richard of Comet Biorefining described the rationale for creating a joint venture between Comet, a technology provider, and CSPC, a feedstock provider. The first Comet plant in Sarnia, which is in the planning stage, will consume 60 kT/y of crop residues and produce approximately 27 kT/y (60 million pounds) of 95% pure dextrose for biochemical production. The by-product, a mix of C5 sugars and lignin, will be sold as a binder to the animal feed industry. Mike Hartmann of Bio-Amber, the eventual purchaser of Comet's cellulosic sugar, described their model for succinic acid: while it is a drop-in and chemically identical to petroleum-based succinic acid, it is cheaper, and cost-competitive in North America with oil at $25/bbl. (Bio-Amber has exported to China, where it is competitive with oil at $45/bbl after transportation costs, tariffs and VAT of 17% are included.)

Bio-Amber's Sarnia plant, at 30 kT/y, is the biggest succinic acid plant worldwide, regardless of feed (bio or oil). Nonetheless, competing with an entrenched supply chain is challenging and proving the performance of the bio-based version to customers is time-consuming. However, once a customer has been signed, there is little incentive for the customer to leave due to the environmental benefits and lower cost. 

The current uses of the product, as laid out in the classic 2004 DOE report on top chemicals from biomass, are to replace adipic acid and a variety of esters, or for conversion to butanediol (BDO) and tetrahydrofuran (THF). Bio-Amber's growth plans include building a second plant making 100 kT/y of BDO and THF and an additional 200 kT/y of succinic acid. Meanwhile, demand continues to grow, although more slowly than hoped for; the Sarnia plant has been run at 75% of capacity, but overall sales (to over 50 companies today) are running at 35% of capacity. So although not as fast as hoped, the growth of the succinic acid market is positive and ranks with the growth of the furanic markets described above.

The Redefinery program and the Ports of Amsterdam, Rotterdam and Antwerp

In the fall of 2016 I visited a number of groups involved in a radical and ambitious program to build biorefinery plants adjacent not to the biomass supply but to the end-user, i.e. the petrochemical industry. This program, loosely called the Redefinery program and mirroring the Sarnia Cluster described above, is being led by industrial players in Holland, Flanders and North Rhine-Westphalia, financed by these governments and by the EU, and supported by a range of R&D providers such as TNO, ECN and VITO, and various local universities including Delft, Wageningen and KU Leuven. Portions of this project were described by representatives of Biorizon (Joop Groen) and Bio-Based Delta (Willem Sederel). While Canadian pulp and paper industry players might argue a pulp mill is an excellent site for a biorefinery, with existing steam and effluent treatment plants, logistics for biomass supply and product shipment, all on an existing industrial site with all the required permits in place, the same can be said of the Port of Rotterdam once you realise that several million tonnes of wood pellets flow through the port annually. Essentially the background, provided by Willem Sederel and Joop Groen, starts with a strong political desire to see a bio-chemical industry arise in the Port of Rotterdam or elsewhere in the well-integrated coastal chemical parks around the ports of Antwerp, Rotterdam and Amsterdam.

Biomass availability in and around the Port of Rotterdam consists mainly of wood pellets, as much as 8 Mt/y by 2020, imported for use in coal-fired power plants. In particular, the wood-to-sugar pathway, leading to organic acids such as succinic acid, lactic acid, FDCA and others, has been identified as critical by players such as Corbion, Avantium and others. The Dutch industry players have been smart enough to realise the importance of bio-aromatics from lignin to the economics of second-generation sugars, and are aware that new uses for lignin will be critical if the path to sugars is to be profitable. They are spending large sums in Dutch and Flemish R&D facilities to quickly get up to speed on lignin properties and transformation processes. (The importance of bio-aromatics to the petrochemical industry is such that pathways from sugars to aromatic chemicals are also of interest.) The objective is a 0.25 to 1 Mt/y plant making sugars and lignin, probably via steam explosion, by 2019, and multiple plants shortly thereafter; budgets are up to 3.7B in industrial and EU funds over the period 2014-2020. The key players driving this very aggressive agenda are:
    • The Port of Rotterdam, which has created a new 80-hectare site on reclaimed land to host bio-chemicals producers;
    • RWE, who would look after pellet supply and logistics;
    • Corbion, who would take C6 sugars;
    • The successor to Abengoa, who would take C5 sugars for ethanol;
    • Coal-fired power producers, who would initially burn lignin for fuel.
So where will the world's first million-tonne wood-based biorefinery be located? In my previous life in the Canadian pulp and paper industry, I would have hoped that a Canadian pulp mill would host (if not build and operate) such a plant; given wood supply issues, my second guess would have been Brazil. But the speed at which the Dutch are moving is such that I am now prepared to bet on Rotterdam (although the 2019 timeline is probably excessively optimistic). Stay tuned! The next World Bio-Congress will be in Philadelphia July 16-19, 2018.

Monday, May 29, 2017

Big Data: Getting information from data

Big Data is all around us. But there are pitfalls in converting this growing avalanche of data into useful information.

Two decades ago, we saw the early introduction of data collection and storage systems into industrial settings. My own experience with these systems, such as the PI data historian system from OSIsoft, was in the pulp and paper industry, but the prevalence of these systems crosses sector boundaries. The purpose was to look at trends in all the different sensor readings, control actions and actuator settings in an industrial process, in order to attempt to draw conclusions from the accumulated machine records that might not be immediately obvious to operators or technical staff. These conclusions could then presumably be used to reduce operating costs or improve product quality.

Growth of these systems tracked the decrease in the cost of computer data storage. With memory chips and hard drives being expensive early in the computer era, operating data was often overwritten daily if not hourly. A study I ran in the late 1990s [see reference below] had access to 36 months' worth of industrial data, collected daily by a mill engineer and averaged monthly. So 36 data points to describe 3 years of operations ... today, a year's worth of industrial process data from a full-scale plant amounts to terabytes, but can be stored forever at a cost of a few dollars.

In parallel with the rise of data historians and archiving systems, we saw the growth of analytical processes to try to make sense of this new avalanche of data. A quick Google search yields an enormous list of textbooks and articles on the topic of data mining and so-called 'statistical learning'; some are even sold by Amazon as well as publishers such as Springer-Verlag or Wiley.
Figure 1: Data from Nature, as used to illustrate correlation versus cause and effect by the late Prof. Martin Weber, Dep't of Chemical Engineering, McGill University, ca. 1990.
The problems that I have run into in a couple of cases are due to the data analyst not properly understanding the process in question, and coming up with recommendations which neglect underlying reasons (technical, economic, product quality, environmental permit constraints, etc.) for operating a particular mill in a particular fashion. In the worst cases, the old issue of correlation versus cause and effect rears its ugly head.
Figure 2: In this correlation, each stork is associated with approximately one live birth per working day (5 days per week, 50 weeks per year). 
In one case, data mining applied to an industrial data set was used to show that the value of a particular variable tended to swing over a very wide range. No good reason could be identified from the data as to why the variable should move around so much. A mild correlation with certain operating inputs (feedstock, water, energy, chemicals) was used to suggest that reducing the variability in this particular variable would help improve overall operating costs.

In fact, the variable in question was manipulated by a control algorithm, whose objective was to maintain the minimum product quality necessary to satisfy customers in the face of raw material variability, while using the least possible amount of raw materials and process inputs. Reduced quality meant lost or heavily discounted sales, and had a much bigger impact on the bottom line than the slight increase in operating costs occasionally required to maintain product quality in the face of the inevitable disturbances. The quality-related variables were not flagged by the data mining process, since they did not move at all and in fact were uncorrelated with any other variable. This was, in fact, a good thing, as it proved the existence of a well-designed control algorithm: yes, the manipulated variable moves around a lot, but the result is that the controlled variable (product quality) is stable.

So this is an example where faulty information was extracted from big data, and it shows the importance of understanding the difference between industrial data (where a controller or an operator may have his or her hand on a valve) and lab-generated data (which covers every possible combination of all variables, even combinations that will never be implemented industrially). The article referenced below provides another example: mill data showed conclusively that using less bleach led to brighter paper, a correlation which is clearly nonsense on its own, but which makes sense in context.
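As a toy illustration of that last point, here is a hypothetical simulation (the variable names, gain and noise levels are all invented): an integral controller adjusts a chemical dosage to hold product quality on target against a slowly drifting feedstock. Mining the resulting records shows a wildly swinging dosage that correlates strongly with the feedstock, and a quality variable that correlates with nothing, exactly the signature of a well-designed control loop:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

feedstock = np.cumsum(rng.normal(0, 0.1, n))  # slowly drifting raw-material quality
quality = np.zeros(n)   # controlled variable, target = 0
dosage = np.zeros(n)    # manipulated variable

for t in range(1, n):
    # Process: quality improves with dosage, degrades with the feedstock drift.
    quality[t] = dosage[t - 1] - feedstock[t] + rng.normal(0, 0.05)
    # Integral control: move the dosage to cancel the remaining error.
    dosage[t] = dosage[t - 1] + 0.5 * (0.0 - quality[t])

print("std of dosage  :", round(dosage.std(), 3))    # swings over a wide range
print("std of quality :", round(quality.std(), 3))   # barely moves
print("corr(dosage, feedstock) :", round(np.corrcoef(dosage, feedstock)[0, 1], 3))
print("corr(quality, feedstock):", round(np.corrcoef(quality, feedstock)[0, 1], 3))
```

A naive recommendation to 'stabilise the dosage' would be exactly backwards: lock down the manipulated variable and product quality inherits all of the feedstock's variability.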

One thing I have learned from helping to develop process control systems is this: it is very hard to find a single person who is good at both process and control. The best approach is to find a good control person (of whom there are many) and pair him or her with a good process person (there are also lots of those out there). I would suggest the same is true of the broad new tools for data analysis (mining, process modeling and integration, etc.): there are lots of analytics people out there, but for best effect they need to be paired with a process expert who understands the underlying chemistry and physics, as well as the business context.

I have focused here on industrial challenges. Today we are seeing data mining and analysis tools applied to any sphere where lots of data exists. For instance, Google collects staggering amounts of data every second (including data about who reads this blog, and what else those readers might be interested in), and presumably is building analytical systems to extract information about the world from all this data. (Should we worry that Google knows so much about us? A discussion for outside this post...)

So while this post has discussed data in an industrial setting, perhaps it is worth asking if the same challenges exist in the social or other applications of data analysis, and if so, how are they being managed? Big Questions around Big Data.


Reference: Browne, T.C., Miles, K.B., McDonald, J.D. and Wood, J.R., “Multivariate Analysis of Seasonal Pulp Quality Variations in a TMP Mill”, Pulp Pap Can 105(10):35-39 (October 2004).