As part of this year’s crowdfunding, the Opencast board in cooperation with the group of Opencast committers and some commercial partners would like to raise money to address some underlying infrastructure, security and performance problems.
These funds will help us to future-proof Opencast, make it easier to operate and deploy, and ensure the security standard we all expect.
We are actively seeking a discussion about the technical requirements related to these changes, in particular with the group of committers participating in the technical meetings. This is something we would like to continue while working on the issues.
These proposed changes affect many core Opencast components, and as such all time estimates are subject to change. We have tried to be as accurate as possible and will strive to be very transparent about what we spend time on.
Spring Security is a powerful and highly customizable authentication and access-control framework used by Opencast to handle logins and to decide if, and how, users may access different parts of the application.
The version of Spring we are currently using is out of date and in need of an update. Updating it will ensure Opencast’s core security layer is up to date and will resolve a few outstanding bugs.
Unfortunately, Spring is closely tied to several authentication methods for Opencast (LDAP, CAS, Shibboleth, …), causing a ripple effect of components which need to be updated and/or replaced.
Addressing the problem
To go forward, we propose to update Spring, and deactivate all external authentication methods. This first step allows us to ensure that the Opencast core authentication method works fine and does not cause problems or side effects in its shiny, new, updated form.
Once that is done, we would like to evaluate how best to continue with the external authentication components. Based on the effort we estimate for updating the modules, we see two ways of going forward and would like to decide on one of them in an open discussion:
- Continue as before, updating all the necessary authentication modules. This, of course, is the way we prefer if the updates turn out to be easy. This will require extensive community involvement to ensure the various integrations continue to work.
- Only keep some basic internal authentication mechanisms and use an authentication proxy or an external plugin to provide different authentication options. This approach is common in modern applications (see Grafana, for example) and is also what the new video portal will be using. Of course, we would provide examples for the most common authentication methods like LDAP.
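To make the second option more concrete, here is a minimal sketch of how an application can rely on an authentication proxy: the proxy performs the actual login (LDAP, CAS, …) and passes the user name on in a request header, which the application accepts only from trusted proxy addresses. This is not Opencast code. It is written in Python purely for brevity, and the header name `X-Forwarded-User`, the trusted network and the `Request` type are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Mapping, Optional

# Hypothetical values for illustration only; neither comes from Opencast.
TRUSTED_PROXIES = (ip_network("127.0.0.0/8"),)
USER_HEADER = "X-Forwarded-User"


@dataclass(frozen=True)
class Request:
    """Simplified stand-in for an incoming HTTP request."""
    remote_addr: str
    headers: Mapping[str, str]


def authenticated_user(request: Request) -> Optional[str]:
    """Return the user name asserted by the proxy, or None if untrusted."""
    source = ip_address(request.remote_addr)
    # Only a trusted proxy may assert an identity; a direct client
    # could otherwise simply forge the header.
    if not any(source in network for network in TRUSTED_PROXIES):
        return None
    user = request.headers.get(USER_HEADER, "").strip()
    return user or None
```

The important design point is the address check: the header is only meaningful if the application cannot be reached in a way that bypasses the proxy.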
We invite everyone to participate in this discussion once it is time for the decision to be made. Watch the users/dev list for that discussion.
Whatever we do, this is certainly a larger change to the internal infrastructure of Opencast and as such, we would like to not rush the change. This means we would target Opencast 12, instead of the upcoming release of Opencast 11. This allows us to detect potential issues early and also allows adopters to test and adapt their integrations if required.
We estimate approximately one to two weeks to update the core and investigate possible further steps to fuel a discussion, with five to six weeks of work time for this task if we need to update all the integrations.
Search index update
At some point as an Opencast administrator, you are almost guaranteed to have to touch one or the other of Opencast’s search indexes and so may have asked yourself, what are these indexes used for and why are there so many?
The answer is complicated and boils down to history. Some aren’t actually necessary any longer, having been created for front-ends which have long since been removed. Others are still active. As a committer group we have long been aware of the state of the indexes, especially the different types (Solr vs Elasticsearch), but “it still works” and thus it is hard to justify spending time on removing them.
Still, people have often faced problems involving indexes, especially over the last two years when systems grew bigger and bigger. This means that saying “index update” can literally mean a bad day, or week, for someone.
This is what we would like to address with three main goals in mind:
- Remove the unnecessary indexes
- Unify the ecosystem, consolidating all remaining indexes to use one system (likely OpenSearch)
- Fix major performance pain points in the infrastructure
As with the previous project, we would like to be transparent and keep the community informed about what we do, while we do it. Fortunately, for this, the tasks are more obvious. Here is a list of Opencast components and what there is to do:
- The Search Service is one component still using the old Solr infrastructure. This index, in particular, is heavily used and depends heavily on full-text search. This means we need to move it to the new search index infrastructure while preserving backwards compatibility as well as we can.
- The Workflow Service still contains an old Solr index which can likely be removed. No internal components require full-text search any longer. One point open for discussion is, however, the current integration in the API which might be affected in some very specific sub-queries.
- The Series Service is also using an old Solr index. It is not used internally and can probably be removed. For external applications, the same information is available via the external API and its search index.
- The Index Service is the heart of the Elasticsearch based services. Its problem is that instead of the well-supported high-level API, it still includes and modifies Elasticsearch internally. We should move to the new API there. This also allows us to address some known performance problems, potentially reducing the system load caused by the admin interface and/or the External API. The Index Service is used by the Admin Interface API and the External API, but we believe that only an update of the central component is necessary.
To fix these issues, we estimate a workload of about 7 weeks.
Apache ActiveMQ is used to send messages between components in Opencast. Its use has declined over time and is now at a point where we would like to remove it entirely. This would lead to a smaller system footprint, less complexity and a system that is easier to set up and maintain.
The task is relatively straightforward since messages are sent only within the admin node and not to external servers or services. This means what we need to do is to find these communication channels and decide how to wire them together instead. OSGi should help with this.
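As a rough illustration of what wiring components together directly can look like, the sketch below replaces a message broker with a plain in-process publish/subscribe object. It is not the planned Opencast design, which would be done in Java with OSGi services. Python is used only to keep the example short, and the class name `LocalMessageBus` and the topic name are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class LocalMessageBus:
    """Delivers messages synchronously to handlers in the same process."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Call all handlers for the topic; return how many were called."""
        # Copy the list so handlers may subscribe during delivery.
        handlers = list(self._handlers.get(topic, ()))
        for handler in handlers:
            handler(message)
        return len(handlers)


# Usage: one component publishes, another reacts, no broker involved.
bus = LocalMessageBus()
received: List[Any] = []
bus.subscribe("series.updated", received.append)
bus.publish("series.updated", {"id": "s1"})
```

Because everything runs on one node, delivery is an ordinary method call, so there is no separate service to install, configure or monitor.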
We estimate that this work will take about two weeks of work time to finish.
These should need no further explanation, but if you still need one: Opencast is a public, HTTP-based service with a large attack surface. Attackers come up with more and more ways of getting into systems and it is our duty to keep them out. This means we need to keep our defenses up and fix even potential problems before it is too late.
Opencast has a good track record of identifying and fixing potential problems and we would like to continue this by investigating some potential problems we have identified and, of course, fixing them if they turn out to be a problem.
We estimate about two weeks of work time to get this done.
We would like to start mid-October. In particular, we would like to get the first part of the Spring update done as fast as possible, since we need a community discussion before we start with the second half.
We would like to start with:
- Spring Security core update
- Remove indexes we do not need to replace (series, workflows)
- Remove ActiveMQ
- Security updates
Who is doing the work
We have talked to the commercial partners in the Opencast space and have identified two organizations willing to cooperate on this:
Want to help?
If you want to participate, please contact the head of the Opencast Board, Olaf Schulte. He will discuss the best way forward with you. Contributions can be made either as contracts directly with the participating companies, or via the Apereo Foundation. Talk to Olaf if you have additional requirements or constraints.
Due to the amount of work, the execution and contribution of the code will start in 2021 but continue and be finished in 2022. Opencast 12 is our focus for the bulk of the work.