Crowdfunding: An Update

What’s the current state of the crowdfunding? Where are we, when it comes to our goals? What work has been done already? These are the questions this article tries to answer.

TL;DR

  • We did a lot of security updates
  • Work on removing ActiveMQ is done
  • Work on two of the three Solr indexes is done
  • We have problems with the Spring update

ActiveMQ

github.com/opencast/opencast/pull/3100

The goal: Installing Opencast may be somewhat confusing to new users, partly because there are lots of different additional services to run. For a long time, one of them has been ActiveMQ, a message broker used for inter-service communication in Opencast. Used… well… barely used, actually. With recent versions, we needed ActiveMQ on a single server only. Since ActiveMQ is meant to distribute information across multiple servers, this meant we could just as well communicate with these services directly. In short, less overhead and fewer additional services to run for adopters. That is why our goal was to entirely remove Opencast’s dependency on ActiveMQ.

Current state: Work on this task has been mostly finished. A pull request removing ActiveMQ has been filed and reviewed. All that is left is a bit of cleanup work before it can be merged. This means that this is almost guaranteed to make it into the next major Opencast release.

Security Issues

github.com/opencast/opencast/security/advisories/GHSA-hcxx-mp6g-6gr9
github.com/opencast/opencast/security/advisories/GHSA-j4mm-7pj3-jf7v
github.com/opencast/opencast/security/advisories/GHSA-59g4-hpg3-3gcp
github.com/opencast/opencast/security/advisories/GHSA-mf4f-j588-5xm8

The goal: Opencast has a good track record of identifying and fixing security issues, and we had identified a few known or potential security issues we wanted to evaluate and fix, if they turned out to be problematic. That way we can keep our servers safe and avoid any spectacular data breaches.

Current state: There have been a number of security fixes for Opencast 9, 10 and 11. The issues we addressed range from limited data extraction through privilege escalation to potential remote code execution attacks. Fixes for these have been included in the last couple of releases. We have also been able to dismiss a few reports about code we suspected to be problematic but which turned out not to be a problem after all. Still, we have not yet processed the whole list of suspects. We will inform you, as usual, if we release another security patch and will keep trying to make these releases as responsible and painless as possible for adopters.

Log4j: We cannot talk about security fixes without pointing out one particular problem we faced as part of the crowdfunding. The Log4Shell remote code execution vulnerability, and several additional vulnerabilities found in this library after the world’s security researchers all turned their attention towards Log4j, affected Opencast as well. We released several versions of Opencast in December to address these issues as fast as we could, since we knew that these vulnerabilities were actively exploited. To help adopters, we even decided to release new versions of Opencast 9, since it had only just reached its end of life and we knew that many adopters had not updated yet.

Solr

github.com/opencast/opencast/pull/3204
github.com/opencast/opencast/pull/3376
github.com/opencast/opencast/pull/3377

The goal: Opencast uses both Solr and Elasticsearch for full text search and caching. Both services serve an almost identical purpose. However, one of them is in desperate need of attention: Solr. We built our integration against an older version of Solr, which is both too old to easily deploy in a cluster and not easy to update. In short, things have to change. But instead of updating Solr and still ending up with two different services doing the same thing, we chose to consolidate on Elasticsearch¹.

Current state: Opencast uses Solr for three services: the series service, the workflow service and the search service. All of these services were user-facing in Opencast (Matterhorn) 1.x, which is why full text search and caching were important. The same is no longer true today, and thus the need for some of these indexes no longer exists.

We were able to completely remove two of the three Solr indexes, sparing adopters from ever having to re-index them again. The services this was done for are the workflow service and the series service. In the future, their data will be requested from the database directly. The patches for these are currently being reviewed. We hope to get them merged soon so that they are included in the next major version of Opencast.

Work on the final service, the search service, is more complex and not yet done. We cannot remove Solr in the same manner, since full text search capabilities are actually used here and the service is still user-facing, being the back-end for the players among other things. We hope to still be able to make the shift to Elasticsearch for Opencast 12, but this is more challenging, and we will act with caution since it’s a central piece of Opencast infrastructure being used by all adopters.

¹ We may actually use OpenSearch instead of Elasticsearch, but that should be a drop-in replacement. We will report if it actually is. But for the sake of simplicity, we stick to Elasticsearch for reporting.

Spring

The goal: Opencast uses Spring Security for handling logins and access control. We have fallen behind on updating the library and are now using a version which is no longer supported. While this does still work just fine, it unfortunately carries the risk of suddenly blowing up, as we have seen with Log4j. Thus we would like to update.

Current state: Our plan was to separate the different login mechanisms Opencast supports which are all woven into Spring Security, then start updating the core and basic login mechanisms first. At that point, we wanted to discuss further actions with the community.

Sadly, it turned out that this plan is not as easy as we had hoped. Newer versions of Spring Security do not work well with our OSGi stack, and updating even just the core is not possible. We are now evaluating two options. The first is to use the versions picked up by the Eclipse Gemini and Apache ServiceMix projects, which are still supported but are not the latest versions. The second is support within Karaf itself, which has been hinted at for the next major version but has not yet been confirmed.

Since the Karaf roadmap has not been finalized yet, we decided to focus on the other tasks first, leaving this as the last potential task to tackle. How exactly we will tackle this problem, and whether we can completely fix it within this crowdfunding, is still to be determined. We will make sure to start an open discussion about this once we have collected all the information.

Questions; Next Steps

If you have any questions or want to discuss any of these tasks, don’t hesitate to bring them to the development mailing list, the Matrix chat or the weekly technical meetings. Furthermore, if you want to help, consider reviewing any of the open pull requests linked above.

We will post again once we have reached a new major milestone. Additionally, we will submit a session about the state of the crowdfunding at the upcoming conference. Join us there for a discussion if you are interested.

Opencast Crowdfunding 2021

As part of this year’s crowdfunding, the Opencast board in cooperation with the group of Opencast committers and some commercial partners would like to raise money to address some underlying infrastructure, security and performance problems.

These funds will help us to future-proof Opencast, make it easier to operate and deploy, and ensure the security standard we all expect.

We are actively seeking a discussion about the technical requirements related to these changes, in particular with the group of committers participating in the technical meetings. This is something we would like to continue while working on the issues.

These proposed changes affect many core Opencast components, and as such all time estimates are subject to change. We have tried to be as accurate as possible and will strive to be very transparent about what we spend time on.

Spring update

Spring Security is a powerful and highly customizable authentication and access-control framework used by Opencast to handle logins and to decide if and how users may access different parts of the application.

The version of Spring we are currently using is out of date and in need of an update. Updating it will ensure Opencast’s core security layer is up to date and will resolve a few outstanding bugs.

Unfortunately, Spring closely ties into several authentication methods for Opencast (LDAP, CAS, Shibboleth, …) causing a ripple-effect of components which need to be updated and/or replaced.

Addressing the problem

To go forward, we propose to update Spring, and deactivate all external authentication methods. This first step allows us to ensure that the Opencast core authentication method works fine and does not cause problems or side effects in its shiny, new, updated form.

Once that is done, we would like to evaluate how best to continue with the external authentication components. Based on the effort we estimate for updating the modules, we see two ways of going forward and would like to decide on one of them in an open discussion:

  1. Continue as before, updating all the necessary authentication modules. This, of course, is the way we prefer if it turns out that updates seem to be easy. This will require extensive community involvement to ensure the various integrations continue to work.
  2. Only keep some basic internal authentication mechanisms and use an authentication proxy or an external plugin to provide different authentication options. This is something often used in modern applications (e.g. see Grafana) and also something which e.g. the new video portal will be using. Of course, we would provide examples for the most common authentication methods like LDAP.
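
To make the second option more concrete, here is a minimal sketch of how an application behind an authentication proxy could pick up the user name the proxy has already verified. This is an illustration only, not Opencast code: the class name, the header name `x-forwarded-user` and the idea of a fixed list of trusted proxy addresses are all assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/**
 * Hypothetical sketch of proxy-based authentication.
 * The header name and the trusted addresses are illustrative assumptions.
 */
final class ProxyAuthSketch {
  // Assumed header the proxy sets after a successful login (hypothetical name).
  static final String USER_HEADER = "x-forwarded-user";

  // Only requests arriving from these addresses may assert a user.
  private final Set<String> trustedProxies;

  ProxyAuthSketch(Set<String> trustedProxies) {
    this.trustedProxies = Set.copyOf(trustedProxies);
  }

  /**
   * Returns the user name asserted by the proxy, or empty if the request
   * did not come from a trusted proxy or carries no usable header.
   * Header names are expected in lower case.
   */
  Optional<String> resolveUser(String remoteAddr, Map<String, String> headers) {
    if (remoteAddr == null || !trustedProxies.contains(remoteAddr)) {
      return Optional.empty(); // never trust the header from arbitrary clients
    }
    String user = headers.get(USER_HEADER);
    if (user == null || user.isBlank()) {
      return Optional.empty();
    }
    return Optional.of(user.trim());
  }
}
```

The important design point is the address check: the header is only meaningful if nobody but the proxy can reach the application directly, so a real deployment would also need to block direct access at the network level.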

We invite everyone to participate in this discussion once it is time for the decision to be made. Watch the users/dev list for that discussion.

Whatever we do, this is certainly a larger change to the internal infrastructure of Opencast and as such, we would like to not rush the change. This means we would target Opencast 12, instead of the upcoming release of Opencast 11. This allows us to detect potential issues early and also allows adopters to test and adapt their integrations if required.

We estimate approximately one to two weeks to update the core and investigate possible further steps to fuel a discussion, with five to six weeks of work time for this task if we need to update all the integrations.

Search index update

At some point as an Opencast administrator, you are almost guaranteed to have to touch one or the other of Opencast’s search indexes and so may have asked yourself, what are these indexes used for and why are there so many?

The answer is complicated and boils down to history. Some aren’t actually necessary any longer, having been created for front-ends which have long since been removed. Others are still active. As a committer group we have long been aware of the state of the indexes, especially the different types (Solr vs Elasticsearch), but “it still works” and thus it is hard to justify spending time on removing them.

Still, people have often faced problems involving indexes, especially over the last two years when systems grew bigger and bigger. This means that saying “index update” can literally mean a bad day, or week, for someone.

This is what we would like to address with three main goals in mind:

  1. Remove the unnecessary indexes
  2. Unify the ecosystem, consolidating all remaining indexes to use one system (likely OpenSearch)
  3. Fix major performance pain points in the infrastructure

As with the previous project, we would like to be transparent and keep the community informed about what we do, while we do it. Fortunately, for this, the tasks are more obvious. Here is a list of Opencast components and what there is to do:

  • The Search Service is one component still using the old Solr infrastructure. This index, in particular, is heavily used and heavily depends on full-text search. This means we need to update this to the new search index infrastructure while keeping backwards compatibility as well as we can.
  • The Workflow Service still contains an old Solr index which can likely be removed. No internal components require full-text search any longer. One point open for discussion is, however, the current integration in the API, which might be affected in some very specific sub-queries.
  • The Series Service is also using an old Solr index. It is not used internally and can probably be removed. For external applications, the same information is available via the external API and its search index.
  • The Index Service is the heart of the Elasticsearch based services. Its problem is that instead of the well-supported high-level API, it still includes and modifies Elasticsearch internally. We should move to the new API there. This also allows us to address some known performance problems, potentially reducing the system load caused by the admin interface and/or the External API. The Index Service is used by the Admin Interface API and the External API, but we believe that only an update of the central component is necessary.

To fix these issues, we estimate a workload of about 7 weeks.

Replace/Remove ActiveMQ

Apache ActiveMQ is used to send messages between components in Opencast. Its use has been in decline over time and is now at a point where we would like to remove it entirely. This would lead to a smaller system footprint, less complexity and a system that is easier to set up and maintain.

The task is relatively straightforward since messages are sent only within the admin node and not to external servers or services. This means what we need to do is to find these communication channels and decide how to wire them together instead. OSGi should help with this.
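
As a rough illustration of what wiring components together directly can look like, the sketch below replaces a broker topic with a plain in-process dispatcher that calls its listeners synchronously. It is not taken from the Opencast code base: `LocalTopic`, `LocalTopicDemo` and the message text are hypothetical names, and the OSGi part (binding listeners through the service registry) is deliberately left out to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/**
 * Hypothetical in-process replacement for a broker topic.
 * None of these names come from the Opencast code base.
 */
final class LocalTopic<T> {
  // Thread-safe list so listeners can register while messages are published.
  private final List<Consumer<T>> listeners = new CopyOnWriteArrayList<>();

  /** Registers a listener; in OSGi this could happen when a service is bound. */
  void subscribe(Consumer<T> listener) {
    listeners.add(listener);
  }

  /** Delivers the message synchronously to every listener and returns how many were called. */
  int publish(T message) {
    int delivered = 0;
    for (Consumer<T> listener : listeners) {
      listener.accept(message);
      delivered++;
    }
    return delivered;
  }
}

final class LocalTopicDemo {
  public static void main(String[] args) {
    LocalTopic<String> seriesUpdates = new LocalTopic<>();
    List<String> received = new ArrayList<>();
    seriesUpdates.subscribe(received::add);
    seriesUpdates.publish("series-123 updated");
    System.out.println(received); // prints [series-123 updated]
  }
}
```

Compared to a broker, this trades delivery across servers and message persistence for simplicity, which is acceptable only because the messages never leave the admin node.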

We estimate that this work will take about two weeks of work time to finish.

Security Fixes

These should need no further explanation, but if you still need one: Opencast is a public, HTTP-based service with a large attack surface. Attackers come up with more and more ways of getting into systems and it is our duty to keep them out. This means we need to keep our defenses up and fix even potential problems before it is too late.

Opencast has a good track record of identifying and fixing potential problems. We would like to continue this by investigating some potential problems we have identified and, of course, fixing them if they turn out to be real.

We estimate about two weeks of work time to get this done.

Roadmap

We would like to start mid-October. In particular, we would like to get the first part of the Spring update done as fast as possible, since we need a community discussion before we start with the second half.

We would like to start with:

  • Spring Security core update
  • Remove indexes we do not need to replace (series, workflows)
  • Remove ActiveMQ
  • Security updates

Who is doing the work

We have talked to the commercial partners in the Opencast space and have identified two organizations willing to cooperate on this:

  • Loganite Inc
  • ELAN e.V.

Want to help?

If you want to participate, please contact the head of the Opencast Board, Olaf Schulte. He will discuss the best way for going forward. Contributions can be made either as contracts directly with the participating companies, or via the Apereo Foundation. Talk to Olaf if you have additional requirements or constraints.

Due to the amount of work, the execution and contribution of the code will start in 2021 but continue and be finished in 2022. Opencast 12 is our focus for the bulk of the work.